Passed: 82/116
Passed: 96/116
Test wins: 15 QSE • 13 System

| Source | Weakest Test | P-value | Assessment |
|---|---|---|---|
| QSE | rgb_lagged_sum | 0.00688252 | PASSED |
| System | rgb_lagged_sum | 0.00677909 | PASSED |

| Test | QSE P-value | QSE Assessment | System P-value | System Assessment | Winner |
|---|---|---|---|---|---|
| dab_birthdays1 | 0.32422668 | PASSED | 0.22105501 | PASSED | QSE |
| dab_bytedistrib | 0.0 | FAILED | 0.0 | FAILED | Tie |
| dab_dct | 0.01423758 | PASSED | 0.52281197 | PASSED | System |
| dab_filltree | 0.68537192 | PASSED | 0.01268884 | PASSED | QSE |
| dab_filltree2 | 0.00088211 | WEAK | 9e-08 | FAILED | QSE |
| dab_monobit2 | 1.0 | FAILED | 1.0 | FAILED | Tie |
| dab_opso2 | 0.0 | FAILED | 0.0 | FAILED | Tie |
| diehard_2dsphere | 0.8550421 | PASSED | 0.21198227 | PASSED | System |
| diehard_3dsphere | 0.73169093 | PASSED | 0.20702679 | PASSED | QSE |
| diehard_birthdays | 0.68948069 | PASSED | 0.68506382 | PASSED | System |
| diehard_bitstream | 0.64906889 | PASSED | 0.70955138 | PASSED | QSE |
| diehard_count_1s_byt | 0.65645361 | PASSED | 0.45952164 | PASSED | System |
| diehard_count_1s_str | 0.11219792 | PASSED | 0.92278747 | PASSED | QSE |
| diehard_craps | 0.00246887 | WEAK | 0.00020252 | WEAK | QSE |
| diehard_dna | 0.59557509 | PASSED | 0.98496265 | PASSED | QSE |
| diehard_operm5 ⚠️ | 0.36033879 | PASSED | 0.8149558 | PASSED | QSE |
| diehard_opso | 0.26861769 | PASSED | 0.2340647 | PASSED | QSE |
| diehard_oqso | 0.90230451 | PASSED | 0.13813094 | PASSED | System |
| diehard_parking_lot | 0.37143226 | PASSED | 0.02719069 | PASSED | QSE |
| diehard_rank_32x32 | 0.13840187 | PASSED | 0.22385803 | PASSED | System |
| diehard_rank_6x8 | 0.12052741 | PASSED | 0.9981702 | WEAK | QSE |
| diehard_runs | 0.09303189 | PASSED | 0.80231838 | PASSED | System |
| diehard_squeeze | 0.02449221 | PASSED | 0.07693327 | PASSED | System |
| marsaglia_tsang_gcd | 0.0 | FAILED | 0.0 | FAILED | Tie |
| rgb_bitdist | 0.4951037 | PASSED | 0.51149519 | PASSED | QSE |
| rgb_kstest_test | 0.13269817 | PASSED | 0.72946678 | PASSED | System |
| rgb_lagged_sum | 0.06924377 | PASSED | 0.59383071 | PASSED | System |
| rgb_minimum_distance | 0.00825861 | PASSED | 0.25365625 | PASSED | System |
| rgb_permutations | 0.98330418 | PASSED | 0.56002545 | PASSED | System |
| sts_monobit | 0.19813119 | PASSED | 0.81913595 | PASSED | QSE |
| sts_runs | 0.9196863 | PASSED | 0.99291088 | PASSED | QSE |
| sts_serial | 0.59063634 | PASSED | 0.48763708 | PASSED | System |

Note: The following tests are marked as "Suspect" in Dieharder documentation due to known issues. For example, diehard_operm5 (-d 1) "seems to fail all generators in dieharder" and may have bugs in the original test implementation. Results from these tests should be interpreted with caution.
QSE Suspect Tests (1): diehard_operm5
System Suspect Tests (1): diehard_operm5
Dieharder vs NIST STS: Unlike NIST STS (which uses binary pass/fail criteria), Dieharder is a strength assessment tool that evaluates the distribution of p-values across multiple statistical tests. It is designed to "push a weak generator to unambiguous failure" rather than provide simple pass/fail results.
P-value Interpretation: For a truly random source, p-values should follow a uniform distribution between 0 and 1. Individual WEAK results (p < 0.005 or p > 0.995) are statistically expected with many tests, as approximately 1% of p-values should naturally fall in this range.
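The 1% figure can be checked with a little arithmetic. The sketch below assumes the 116 tests from the summary lines are independent and that p-values are exactly uniform, which real test batteries only approximate. The names `WEAK_MARGIN`, `N_TESTS`, and `in_weak_range` are illustrative and not part of Dieharder.

```python
# Assumption: 0.005 is the WEAK margin quoted in the text above, and
# 116 is the test total taken from the "Passed: .../116" summary lines.
WEAK_MARGIN = 0.005
N_TESTS = 116


def in_weak_range(p: float) -> bool:
    """True when p lies in either extreme tail (p < 0.005 or p > 0.995)."""
    return p < WEAK_MARGIN or p > 1.0 - WEAK_MARGIN


# Under a uniform distribution, each tail has probability 0.005.
tail_probability = 2 * WEAK_MARGIN

# Expected number of tail p-values across all tests.
expected_in_tails = N_TESTS * tail_probability

# Probability that at least one test lands in a tail,
# assuming the tests are independent (an idealization).
prob_at_least_one = 1.0 - (1.0 - tail_probability) ** N_TESTS

if __name__ == "__main__":
    print(f"expected tail results: {expected_in_tails:.2f}")
    print(f"P(at least one): {prob_at_least_one:.2f}")
    print(in_weak_range(0.9981702))  # System diehard_rank_6x8 from the table
```

Under these assumptions, roughly one tail result per run is expected by chance, so a single WEAK entry is not evidence of a defective source.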
Comparison Methodology: When comparing two sources, the winner is determined by:
Important: Individual test-by-test wins are less meaningful than overall failure counts. A source with fewer FAILED tests is stronger, regardless of individual test comparisons.
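Every row of the Winner column in the table above is consistent with a simple rule: two FAILED results tie, and otherwise the p-value closer to 0.5 wins. This is an inference from the reported rows, not a documented statement of how the report computes winners. The function name `per_test_winner` is hypothetical.

```python
def per_test_winner(qse_p: float, qse_assessment: str,
                    sys_p: float, sys_assessment: str) -> str:
    """Inferred rule: both FAILED ties; else the p-value nearer 0.5 wins."""
    if qse_assessment == "FAILED" and sys_assessment == "FAILED":
        return "Tie"
    qse_distance = abs(qse_p - 0.5)
    sys_distance = abs(sys_p - 0.5)
    if qse_distance < sys_distance:
        return "QSE"
    if sys_distance < qse_distance:
        return "System"
    return "Tie"


if __name__ == "__main__":
    # Rows copied from the comparison table.
    print(per_test_winner(0.8550421, "PASSED", 0.21198227, "PASSED"))  # diehard_2dsphere
    print(per_test_winner(0.00088211, "WEAK", 9e-08, "FAILED"))        # dab_filltree2
    print(per_test_winner(0.0, "FAILED", 0.0, "FAILED"))               # dab_bytedistrib
```

A rule of this kind rewards closeness to 0.5 on a single draw, which carries little information when p-values are uniformly distributed. That is one reason the failure counts matter more than the win tally.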
Test Reliability: Some Dieharder tests have known issues. Tests marked "Do Not Use" (e.g., diehard_sums) are automatically excluded from results. Tests marked "Suspect" (e.g., diehard_operm5) are included but flagged, as they may produce misleading results due to test implementation issues rather than generator weaknesses.
Assessment Levels:
Overall Assessment: A source is considered STRONG if it has no FAILED tests and fewer than 5% WEAK tests. This accounts for expected statistical variation while flagging genuine concerns.
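The STRONG criterion can be expressed as a short check over a list of per-test assessments. This is a minimal sketch of the rule as worded above. The function name `is_strong` is hypothetical, and treating an empty result list as not STRONG is an assumption.

```python
from typing import Iterable


def is_strong(assessments: Iterable[str]) -> bool:
    """True when there are no FAILED tests and fewer than 5% are WEAK."""
    results = list(assessments)
    total = len(results)
    if total == 0:
        return False  # assumption: no results means no STRONG verdict
    failed = sum(1 for a in results if a == "FAILED")
    weak = sum(1 for a in results if a == "WEAK")
    # Integer comparison avoids floating-point edge cases at exactly 5%.
    return failed == 0 and weak * 100 < 5 * total


if __name__ == "__main__":
    print(is_strong(["PASSED"] * 40 + ["WEAK"]))            # one WEAK in 41
    print(is_strong(["PASSED"] * 40 + ["FAILED"]))          # any FAILED
    print(is_strong(["PASSED"] * 19 + ["WEAK"]))            # exactly 5% WEAK
```

The comparison is strict, so a source with exactly 5% WEAK results does not qualify.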