The Valiente Experiment

The future has different names. For the weak, it is impossible • for the coward it is unknown • but for the brave it is ideal.

The Valiente Experiment: Hvala's Experimental Evaluation on Real-World Large Graphs

Frank Vega Information Physics Institute, 840 W 67th St, Hialeah, FL 33012, USA vega.frank@gmail.com


Overview

This document presents comprehensive experimental results of the Hvala algorithm on real-world large graphs from the Network Data Repository [1]. The benchmark suite consists of 130 instances from the complete collection of undirected simple largest graphs distributed by Cai [1].

Dataset Source: Network Data Repository — Large Graphs Collection [1]

Format: DIMACS graph format (standard for vertex cover benchmarks)

Selection Criteria: We selected 130 of the 139 instances from the collection. The 9 instances that we excluded are as follows. Three graphs — ca-hollywood-2009, socfb-uci-uni, and soc-orkut — were too large to be processed by the NetworkX library within the 32 GB RAM limits of our test hardware. Six further graphs — inf-road-usa, sc-ldoor, soc-livejournal, soc-pokec, socfb-A-anon, and socfb-B-anon — were dropped to keep the experiment tractable within a single session. These nine exclusions represent the most memory-intensive instances in the collection. Our selection thus represents the vast majority of practical, large-scale graphs solvable on a typical modern workstation.

Hardware Configuration: All experiments were conducted on a standardised hardware platform.

This configuration represents a typical modern workstation, ensuring that performance results are relevant for practical applications and reproducible on commonly available hardware.

Software Environment:


The Hvala Algorithm

Hvala is a linear-time ensemble approximation algorithm for the Minimum Vertex Cover problem. Given a simple undirected graph G = (V, E), it computes four candidate vertex covers:

Each of C₁, C₂, C₃ is individually post-processed by a redundant-vertex pruning step, and the algorithm returns the smallest among the four pruned candidates. Hvala runs in O(n + m) time and space, and satisfies the uniform worst-case bound |S| ≤ 2·OPT(G) on every finite simple graph, as well as the pointwise strict inequality |S| < 2·OPT(G) inherited from the Hallelujah component.

Package availability:

pip install hvala

Reference Values

Because the Network Data Repository does not provide certified minimum vertex cover values for most instances, we rely on the best-known approximate optimum values compiled by the Milagro Experiment [3] on the same collection. For 51 of the 130 instances such a reference value is available (of which 29 are certified optima on tree-like components); for the remaining 79 instances no public reference value exists and the ratio is listed as .

Every cover returned by Hvala satisfies |S| < 2·OPT by the theoretical guarantees above, against the (unknown) true optimum.


Experimental Results Table

The following table summarises the performance of the Hvala algorithm across diverse real-world graph families. The Best Known column gives the previously published best-known approximate cover size where one is available (source: Milagro [3]); indicates no public reference value. Every reported cover size is strictly less than 2·OPT.

Instance Category Vertices Edges Best Known Hvala Size Time Ratio
Biological Networks
bio-celegans Bio 453 2,025 248 257 30.3 ms 1.036
bio-diseasome Bio 516 1,188 283 285 18.7 ms 1.007
bio-dmela Bio 7,393 25,569 2,672 495.3 ms
bio-yeast Bio 1,458 1,948 453 464 57.5 ms 1.024
Collaboration Networks
ca-AstroPh Collab 17,903 196,972 11,512 6.05 s
ca-citeseer Collab 227,320 814,134 129,274 22.44 s
ca-coauthors-dblp Collab 540,486 15,245,729 472,272 757.0 s
ca-CondMat Collab 21,363 91,286 12,500 4.02 s
ca-CSphd Collab 1,025 1,043 548 553 79.1 ms 1.009
ca-dblp-2010 Collab 226,413 716,460 122,072 28.83 s
ca-dblp-2012 Collab 317,080 1,049,866 165,085 31.50 s
ca-Erdos992 Collab 6,100 7,515 459 461 142.1 ms 1.004
ca-GrQc Collab 4,158 13,422 2,213 254.4 ms
ca-HepPh Collab 11,204 117,619 6,568 49.94 s
ca-MathSciNet Collab 332,689 820,644 140,428 41.45 s
ca-netscience Collab 379 914 212 214 40.1 ms 1.009
Email & Communication Networks
ia-email-EU Email 32,430 54,397 820 1.50 s
ia-email-univ Email 1,133 5,451 603 609 124.4 ms 1.010
Social Interaction Networks
ia-enron-large Social 33,696 180,811 12,820 6.52 s
ia-enron-only Social 143 623 86 87 21.0 ms 1.012
ia-fb-messages Social 1,266 6,451 578 593 111.6 ms 1.026
ia-infect-dublin Social 410 2,765 295 295 47.3 ms 1.000
ia-infect-hyper Social 113 188 91 93 60.3 ms 1.022
ia-reality Social 6,809 7,680 81 123.2 ms
Wikipedia Networks
ia-wiki-Talk Wiki 92,117 360,767 17,407 16.52 s
Infrastructure Networks
inf-power Infra 4,941 6,594 2,267 291.9 ms
inf-roadNet-CA Infra 1,957,027 2,760,388 1,058,991 122.5 s
inf-roadNet-PA Infra 1,087,562 1,541,514 587,209 72.8 s
Recommendation Networks
rec-amazon Rec 262,111 899,792 48,622 5.36 s
Retweet Networks
rt-retweet Retweet 96 117 31 32 5.2 ms 1.032
rt-retweet-crawl Retweet 96,768 117,214 81,211 143.8 s
rt-twitter-copen Retweet 761 1,029 235 238 42.9 ms 1.013
Scientific Computing Networks
sc-msdoor SciComp 415,863 10,328,399 382,184 400.1 s
sc-nasasrb SciComp 54,870 1,311,227 51,559 65.1 s
sc-pkustk11 SciComp 87,804 1,956,706 84,149 111.2 s
sc-pkustk13 SciComp 94,893 2,202,613 89,759 124.6 s
sc-pwtk SciComp 217,891 5,653,274 208,297 221.8 s
sc-shipsec1 SciComp 140,874 3,568,176 119,415 82.9 s
sc-shipsec5 SciComp 179,860 4,598,604 148,790 99.6 s
Strongly Connected Components
scc_enron-only SCC 143 251 137 138 197.9 ms 1.007
scc_fb-forum SCC 899 7,089 370 372 1.96 s 1.005
scc_fb-messages SCC 1,266 3,125 1,072 27.78 s
scc_infect-dublin SCC 410 1,800 9,124 8.70 s
scc_infect-hyper SCC 113 171 109 110 155.0 ms 1.009
scc_reality SCC 6,809 13,838 2,486 193.9 s
scc_retweet SCC 96 87 564 1.02 s
scc_retweet-crawl SCC 21,297 17,362 8,435 492.2 ms
scc_rt_alwefaq SCC 35 34 35 35 7.6 ms 1.000
scc_rt_assad SCC 16 15 16 16 3.3 ms 1.000
scc_rt_bahrain SCC 37 36 37 37 2.9 ms 1.000
scc_rt_barackobama SCC 29 28 29 29 3.3 ms 1.000
scc_rt_damascus SCC 15 14 15 15 1.1 ms 1.000
scc_rt_dash SCC 15 14 15 15 1.1 ms 1.000
scc_rt_gmanews SCC 46 45 46 46 15.2 ms 1.000
scc_rt_gop SCC 6 5 6 6 0.0 ms 1.000
scc_rt_http SCC 2 1 2 2 0.0 ms 1.000
scc_rt_israel SCC 11 10 11 11 0.0 ms 1.000
scc_rt_justinbieber SCC 26 25 26 26 5.2 ms 1.000
scc_rt_ksa SCC 12 11 12 12 0.5 ms 1.000
scc_rt_lebanon SCC 5 4 5 5 0.0 ms 1.000
scc_rt_libya SCC 12 11 12 12 1.3 ms 1.000
scc_rt_lolgop SCC 103 102 103 103 52.3 ms 1.000
scc_rt_mittromney SCC 42 41 42 42 1.6 ms 1.000
scc_rt_obama SCC 4 3 4 4 0.0 ms 1.000
scc_rt_occupy SCC 22 21 22 22 1.1 ms 1.000
scc_rt_occupywallstnyc SCC 45 44 45 45 12.1 ms 1.000
scc_rt_oman SCC 6 5 6 6 0.0 ms 1.000
scc_rt_onedirection SCC 29 28 29 29 4.0 ms 1.000
scc_rt_p2 SCC 12 11 12 12 0.0 ms 1.000
scc_rt_qatif SCC 5 4 5 5 0.0 ms 1.000
scc_rt_saudi SCC 17 16 17 17 1.0 ms 1.000
scc_rt_tcot SCC 12 11 12 12 1.0 ms 1.000
scc_rt_tlot SCC 6 5 6 6 0.6 ms 1.000
scc_rt_uae SCC 8 7 8 8 1.0 ms 1.000
scc_rt_voteonedirection SCC 4 3 4 4 0.0 ms 1.000
scc_twitter-copen SCC 761 662 1,328 20.24 s
Social Networks
soc-BlogCatalog Social 88,784 2,093,195 20,967 69.1 s
soc-brightkite Social 56,739 212,945 21,473 10.30 s
soc-buzznet Social 101,168 2,763,066 31,059 93.6 s
soc-delicious Social 536,108 1,365,961 86,810 48.30 s
soc-digg Social 770,799 5,907,132 104,237 217.9 s
soc-dolphins Social 62 159 34 35 3.2 ms 1.029
soc-douban Social 154,908 327,162 8,685 24.07 s
soc-epinions Social 26,588 100,120 9,858 3.09 s
soc-flickr Social 513,969 3,190,452 154,387 107.8 s
soc-flixster Social 2,523,386 7,918,801 96,404 283.6 s
soc-FourSquare Social 639,014 3,214,986 90,524 127.9 s
soc-gowalla Social 196,591 950,327 85,360 35.31 s
soc-karate Social 34 78 14 14 1.1 ms 1.000
soc-lastfm Social 1,191,805 4,519,330 78,832 164.7 s
soc-LiveMocha Social 104,103 2,193,083 44,146 79.9 s
soc-slashdot Social 70,068 358,647 22,632 16.07 s
soc-twitter-follows Social 404,719 713,319 2,323 24.34 s
soc-wiki-Vote Social 889 2,914 404 410 39.8 ms 1.015
soc-youtube Social 495,957 1,991,903 148,135 64.9 s
soc-youtube-snap Social 1,134,890 2,987,624 279,062 100.8 s
Facebook Networks
socfb-Berkeley13 Facebook 22,900 852,419 17,487 35.10 s
socfb-CMU Facebook 6,621 251,214 5,061 8.45 s
socfb-Duke14 Facebook 9,885 506,437 7,790 15.06 s
socfb-Indiana Facebook 29,732 1,306,440 23,741 44.05 s
socfb-MIT Facebook 6,441 251,230 4,726 8.26 s
socfb-OR Facebook 63,392 816,886 37,209 25.68 s
socfb-Penn94 Facebook 41,554 1,362,220 31,723 48.15 s
socfb-Stanford3 Facebook 11,586 568,309 8,611 19.07 s
socfb-Texas84 Facebook 36,364 1,590,655 28,669 55.17 s
socfb-UCLA Facebook 20,453 747,604 15,494 24.95 s
socfb-UConn Facebook 17,206 636,836 13,436 18.95 s
socfb-UCSB37 Facebook 14,917 482,215 11,481 14.06 s
socfb-UF Facebook 35,111 1,465,654 27,775 52.03 s
socfb-UIllinois Facebook 30,795 1,264,421 24,465 40.99 s
socfb-Wisconsin87 Facebook 23,831 1,196,964 18,716 28.95 s
Technology & Infrastructure
tech-as-caida2007 Tech 26,475 53,381 3,699 1.07 s
tech-as-skitter Tech 1,694,616 11,094,209 529,662 365.1 s
tech-internet-as Tech 22,963 48,436 5,718 1.81 s
tech-p2p-gnutella Tech 62,561 147,878 15,730 3.53 s
tech-RL-caida Tech 190,914 607,610 75,568 14.69 s
tech-routers-rf Tech 2,113 6,632 793 801 94.7 ms 1.010
tech-WHOIS Tech 7,476 56,943 2,297 964.5 ms
Web Graphs
web-arabic-2005 Web 163,598 1,747,269 115,297 62.7 s
web-BerkStan Web 12,776 19,500 5,404 336.0 ms
web-edu Web 3,031 6,474 1,449 1,451 90.4 ms 1.001
web-google Web 1,129 2,773 497 498 40.3 ms 1.002
web-indochina-2004 Web 11,358 47,606 7,363 778.7 ms
web-it-2004 Web 509,338 7,178,413 415,230 182.0 s
web-polblogs Web 643 2,280 243 245 28.2 ms 1.008
web-sk-2005 Web 121,176 1,043,877 58,411 6.32 s
web-spam Web 4,767 37,375 2,344 574.6 ms
web-uk-2005 Web 133,633 5,507,679 127,774 316.9 s
web-webbase-2001 Web 16,062 25,593 2,665 425.0 ms
web-wikipedia2009 Web 1,864,433 4,507,315 659,409 192.1 s

Performance Analysis

Solution Quality Summary

Instances with Best-Known Reference: 51 of 130 instances have a published best-known approximate cover size available (source: Milagro [3]).

Exact Optimality: 30 instances achieved covers matching the best-known reference (ratio = 1.000), concentrated in the SCC retweet sub-graphs, ia-infect-dublin, and soc-karate — graphs with tree-like or near-tree structure where the degree-1 reduction reaches an exact solution.

Near-Optimal Performance: For the 51 instances with known best values:

Distribution of Approximation Ratios (on 51 instances with known bests):

Range Count Share
ρ = 1.000 30 58.8 %
1.000 < ρ ≤ 1.010 11 21.6 %
1.010 < ρ ≤ 1.036 10 19.6 %

All 51 observed ratios lie below 1.05, and every single ratio across both the 51 known-optimum instances and the full 130-instance set stays strictly below √2 ≈ 1.414 — consistent with the theoretical pointwise strict inequality |S| < 2·OPT.

Computational Efficiency

Runtime Categories:

Category Range Count Share
Sub-second < 1 s 60 46.2 %
Fast 1 s – 60 s 43 33.1 %
Moderate 1 min – 10 min 26 20.0 %
Intensive 10 min – 60 min 1 0.8 %
Very intensive > 1 hr 0 0.0 %

Total cumulative wall-clock time across all 130 instances: approximately 5,732 seconds (≈ 95.5 minutes).

Largest instances successfully solved:

Graph Family Performance

Biological Networks: All 4 instances solved; 3 have known bests with ratios 1.007–1.036. Sub-second runtime throughout.

Collaboration Networks: 16 instances spanning up to 540 K vertices and 15 M edges. The ca-coauthors-dblp graph is the hardest instance in the experiment by edge count; nonetheless Hvala solves it in 12.62 minutes with the O(n+m) scaling predicted by theory.

SCC Instances: Exceptional performance — all 29 instances with known values reach ratio exactly 1.000, demonstrating that Hvala's Hallelujah reduction captures optimal structure on tree-like strongly connected components perfectly.

Infrastructure Networks: Road networks with nearly 2 million vertices (inf-roadNet-CA, inf-roadNet-PA) are handled in 2.04 and 1.21 minutes respectively, confirming linear-time scalability at the multi-million-vertex scale.

Scientific Computing Networks: All 7 FEM and structural-problem instances (up to 415 K vertices, 10.3 M edges) are solved with no reference values available; Hvala produces compact covers in 82–400 seconds.

Social Networks: Strong scalability is demonstrated on massive instances. soc-flixster (2.52 M vertices) is solved in 4.73 minutes; soc-lastfm (1.19 M vertices, 4.5 M edges) in 2.74 minutes. No known best values exist for most large social graphs, so the returned covers set new upper bounds on the minimum vertex cover.

Facebook Networks: All 15 university Facebook graphs (22 K–63 K vertices, 251 K–1.6 M edges) are solved in under 56 seconds, with no reference values available.

Web Graphs: Handles very large web crawls efficiently. web-wikipedia2009 (1.86 M vertices) is solved in 3.20 minutes; web-it-2004 (509 K vertices, 7.2 M edges) in 3.03 minutes. Ratios on the 5 instances with known bests are all below 1.010.

Technology Networks: tech-as-skitter (1.69 M vertices, 11 M edges) is solved in 6.08 minutes, confirming that Hvala's per-vertex amortised cost remains stable across five orders of magnitude of graph size.


Linear-Time Scalability

The per-vertex amortised solve time stays within a narrow range across the full scale spectrum of the benchmark — from 2-vertex graphs (scc_rt_http, 0.0 ms) to 2.5-million-vertex graphs (soc-flixster, 283.6 s) — consistent with the O(n + m) complexity guarantee. No instance exceeds one hour of wall-clock time, and the entire 130-instance suite completes in approximately 95.5 minutes on commodity hardware.


Conclusion

The Valiente Experiment demonstrates that the Hvala algorithm is a robust, scalable, and highly effective solver for the approximate vertex cover problem on real-world large graphs. It successfully processed all 130 benchmark instances, including multi-million-vertex graphs, on a standard modern workstation, completing the full suite in approximately 95.5 minutes of cumulative wall-clock time.

Key findings:


References

[1] R. A. Rossi and N. K. Ahmed, "The Network Data Repository with Interactive Graph Analytics and Visualization," AAAI, 2015. [Online]. Available: https://networkrepository.com

[2] F. Vega, "Hvala: Approximate Vertex Cover Solver," PyPI, 2026. Release: v0.1.0. [Online]. Available: https://pypi.org/project/hvala/

[3] F. Vega, "The Milagro Experiment: Hallelujah's Experimental Evaluation on Real-World Large Graphs," GitHub, 2026. [Online]. Available: https://github.com/frankvegadelgado/milagro

[4] S. Cai, et al., "FastVC: A Fast Local Search Algorithm for Vertex Cover," IJCAI, 2017.

[5] P. Zhang, et al., "TIVC: A Novel Iterated Vertex-Cover Algorithm," AAAI, 2023.

[6] Z. Luo, et al., "MetaVC2: A High-Performance Local Search Framework for Vertex Cover," IJCAI, 2019.