Dieharder Entropy Scorecard

Dieharder Test Suite • 500 sequences × 1,000,000 bits • Binary mode
Generated: 2026-04-20 19:37 UTC
Winner: System
Headline Verdict: Both sources show concerns under Dieharder. System is the stronger performer, with 14 failed and 6 weak tests versus QSE's 24 failed and 10 weak. Failed tests (p < 0.0001 or p > 0.9999) indicate potential randomness weaknesses.

QSE Overall

Passed: 82/116

Weak: 10 • Failed: 24

System Overall

Passed: 96/116

Weak: 6 • Failed: 14

Wins by Test

15 QSE • 13 System

4 ties

Key Risk / Weakest Tests

Source Weakest Test P-value Assessment
QSE rgb_lagged_sum 0.00688252 PASSED
System rgb_lagged_sum 0.00677909 PASSED
Assessment Criteria: Dieharder is a strength assessment tool, not a binary pass/fail test.
  • STRONG: No failed tests and <5% weak tests (some weak results are statistically expected with many tests)
  • REVIEW: No failed tests but ≥5% weak tests (may indicate statistical variation or require further investigation)
  • WEAK: One or more failed tests (p < 0.0001 or p > 0.9999) indicating potential randomness weaknesses
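As a sketch, these criteria map directly to a few lines of Python (the function name and signature are illustrative, not part of Dieharder):

```python
def overall_assessment(failed: int, weak: int, total: int) -> str:
    """Classify a source per the STRONG/REVIEW/WEAK criteria above."""
    if failed > 0:
        return "WEAK"           # any FAILED test indicates potential weakness
    if weak / total >= 0.05:    # >= 5% weak results warrant a closer look
        return "REVIEW"
    return "STRONG"             # no failures, weak results within expectation

# Applied to this report's counts, both sources classify as WEAK:
print(overall_assessment(failed=24, weak=10, total=116))  # QSE    -> WEAK
print(overall_assessment(failed=14, weak=6, total=116))   # System -> WEAK
```

Note that the "System wins" headline is relative: under the absolute criteria, both sources land in the WEAK tier because each has at least one failed test.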
Test Reliability: Some Dieharder tests have known issues. Tests marked "Do Not Use" (e.g., diehard_sums) are automatically excluded. Tests marked "Suspect" (e.g., diehard_operm5) are included but flagged below.
Note: Dieharder evaluates statistical randomness. It does not alone certify cryptographic strength or "quantum resilience."

Per-Test Comparison (All Tests)

Test QSE P-value QSE Assessment System P-value System Assessment Winner
dab_birthdays1 0.32422668 PASSED 0.22105501 PASSED QSE
dab_bytedistrib 0.0 FAILED 0.0 FAILED Tie
dab_dct 0.01423758 PASSED 0.52281197 PASSED System
dab_filltree 0.68537192 PASSED 0.01268884 PASSED QSE
dab_filltree2 0.00088211 WEAK 9e-08 FAILED QSE
dab_monobit2 1.0 FAILED 1.0 FAILED Tie
dab_opso2 0.0 FAILED 0.0 FAILED Tie
diehard_2dsphere 0.8550421 PASSED 0.21198227 PASSED System
diehard_3dsphere 0.73169093 PASSED 0.20702679 PASSED QSE
diehard_birthdays 0.68948069 PASSED 0.68506382 PASSED System
diehard_bitstream 0.64906889 PASSED 0.70955138 PASSED QSE
diehard_count_1s_byt 0.65645361 PASSED 0.45952164 PASSED System
diehard_count_1s_str 0.11219792 PASSED 0.92278747 PASSED QSE
diehard_craps 0.00246887 WEAK 0.00020252 WEAK QSE
diehard_dna 0.59557509 PASSED 0.98496265 PASSED QSE
diehard_operm5 ⚠️ 0.36033879 PASSED 0.8149558 PASSED QSE
diehard_opso 0.26861769 PASSED 0.2340647 PASSED QSE
diehard_oqso 0.90230451 PASSED 0.13813094 PASSED System
diehard_parking_lot 0.37143226 PASSED 0.02719069 PASSED QSE
diehard_rank_32x32 0.13840187 PASSED 0.22385803 PASSED System
diehard_rank_6x8 0.12052741 PASSED 0.9981702 WEAK QSE
diehard_runs 0.09303189 PASSED 0.80231838 PASSED System
diehard_squeeze 0.02449221 PASSED 0.07693327 PASSED System
marsaglia_tsang_gcd 0.0 FAILED 0.0 FAILED Tie
rgb_bitdist 0.4951037 PASSED 0.51149519 PASSED QSE
rgb_kstest_test 0.13269817 PASSED 0.72946678 PASSED System
rgb_lagged_sum 0.06924377 PASSED 0.59383071 PASSED System
rgb_minimum_distance 0.00825861 PASSED 0.25365625 PASSED System
rgb_permutations 0.98330418 PASSED 0.56002545 PASSED System
sts_monobit 0.19813119 PASSED 0.81913595 PASSED QSE
sts_runs 0.9196863 PASSED 0.99291088 PASSED QSE
sts_serial 0.59063634 PASSED 0.48763708 PASSED System
⚠️ Tests marked with warning icon are "Suspect" per Dieharder documentation and may have known implementation issues.
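For rows where both sources share an assessment, the Winner column is consistent with the "p-value closer to 0.5" rule described under Comparison Methodology below. A hypothetical reconstruction of the per-row logic (this is an assumed rule, not Dieharder output):

```python
RANK = {"PASSED": 0, "WEAK": 1, "FAILED": 2}

def row_winner(p_qse, assess_qse, p_sys, assess_sys):
    """Reconstruct a row's Winner: better assessment level first,
    then the p-value closer to 0.5 as an assumed tie-break."""
    if RANK[assess_qse] != RANK[assess_sys]:
        return "QSE" if RANK[assess_qse] < RANK[assess_sys] else "System"
    d_qse, d_sys = abs(p_qse - 0.5), abs(p_sys - 0.5)
    if d_qse == d_sys:
        return "Tie"
    return "QSE" if d_qse < d_sys else "System"

# Spot checks against rows from the table above:
assert row_winner(0.32422668, "PASSED", 0.22105501, "PASSED") == "QSE"  # dab_birthdays1
assert row_winner(0.12052741, "PASSED", 0.9981702, "WEAK") == "QSE"     # diehard_rank_6x8
assert row_winner(0.0, "FAILED", 0.0, "FAILED") == "Tie"                # dab_bytedistrib
```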

⚠️ Suspect Tests (Known Issues)

Note: The following tests are marked as "Suspect" in Dieharder documentation due to known issues. For example, diehard_operm5 (-d 1) "seems to fail all generators in dieharder" and may have bugs in the original test implementation. Results from these tests should be interpreted with caution.

QSE Suspect Tests (1):

  • diehard_operm5 (p-value: 0.36033879)

System Suspect Tests (1):

  • diehard_operm5 (p-value: 0.8149558)

Understanding Dieharder Results

Dieharder vs NIST STS: Unlike NIST STS (which uses binary pass/fail criteria), Dieharder is a strength assessment tool that evaluates the distribution of p-values across multiple statistical tests. It is designed to "push a weak generator to unambiguous failure" rather than provide simple pass/fail results.

P-value Interpretation: For a truly random source, p-values should follow a uniform distribution between 0 and 1. Individual WEAK results (p < 0.005 or p > 0.995) are statistically expected with many tests, as approximately 1% of p-values should naturally fall in this range.
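Since roughly 1% of uniform p-values land in the WEAK bands, the expected number of weak results among 116 tests is about 1.2, and the chance of seeing many more purely by luck is small. A rough sanity check using a binomial tail (the 1% figure and the test count come from this report):

```python
import math

n, p_weak = 116, 0.01  # tests run; chance a uniform p-value lands in a WEAK band

def binom_tail(n: int, p: float, k: int) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

print(f"expected WEAK results: {n * p_weak:.2f}")  # -> 1.16
print(f"P(>= 6 WEAK by chance): {binom_tail(n, p_weak, 6):.5f}")
print(f"P(>= 10 WEAK by chance): {binom_tail(n, p_weak, 10):.8f}")
```

Both sources' WEAK counts (10 and 6) sit well above the expected ~1.16, which argues against dismissing them as routine statistical variation.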

Comparison Methodology: When comparing two sources, the winner is determined by:

  1. Fewer FAILED tests (p < 0.0001 or p > 0.9999) - most important indicator of randomness weaknesses
  2. Fewer WEAK tests (if same number of failed tests) - secondary concern
  3. More PASSED tests (if same failed/weak counts) - indicates overall strength
  4. P-value distribution quality (if all counts are equal) - p-values farther from the 0/1 extremes (i.e., closer to 0.5) are treated as less suspicious
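A minimal sketch of this tie-break order (the function and the summary dicts are illustrative; the counts are this report's):

```python
def compare_sources(a: dict, b: dict) -> str:
    """Apply the tie-break order above to two result summaries."""
    if a["failed"] != b["failed"]:   # 1. fewer FAILED tests wins
        return a["name"] if a["failed"] < b["failed"] else b["name"]
    if a["weak"] != b["weak"]:       # 2. fewer WEAK tests wins
        return a["name"] if a["weak"] < b["weak"] else b["name"]
    if a["passed"] != b["passed"]:   # 3. more PASSED tests wins
        return a["name"] if a["passed"] > b["passed"] else b["name"]
    return "Tie"                     # 4. fall back to p-value distribution

qse = {"name": "QSE", "failed": 24, "weak": 10, "passed": 82}
system = {"name": "System", "failed": 14, "weak": 6, "passed": 96}
print(compare_sources(qse, system))  # -> System (decided at step 1)
```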

Important: Individual test-by-test wins are less meaningful than overall failure counts. A source with fewer FAILED tests is stronger, regardless of individual test comparisons.

Test Reliability: Some Dieharder tests have known issues. Tests marked "Do Not Use" (e.g., diehard_sums) are automatically excluded from results. Tests marked "Suspect" (e.g., diehard_operm5) are included but flagged, as they may produce misleading results due to test implementation issues rather than generator weaknesses.

Assessment Levels:

  • PASSED: P-value between 0.005 and 0.995 (normal range)
  • WEAK: P-value between 0.0001-0.005 or 0.995-0.9999 (borderline, may indicate statistical variation)
  • FAILED: P-value < 0.0001 or > 0.9999 (extreme, indicates potential randomness weaknesses)
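These bands translate directly into a small classifier (a sketch; the exact boundaries, e.g. whether p = 0.005 counts as PASSED, are assumed here):

```python
def assess(p: float) -> str:
    """Map a single Dieharder p-value to this report's assessment bands."""
    if p < 0.0001 or p > 0.9999:
        return "FAILED"   # extreme tail: potential randomness weakness
    if p < 0.005 or p > 0.995:
        return "WEAK"     # borderline: may be statistical variation
    return "PASSED"       # normal range

# Spot checks against the per-test table:
assert assess(0.32422668) == "PASSED"   # dab_birthdays1 (QSE)
assert assess(0.9981702) == "WEAK"      # diehard_rank_6x8 (System)
assert assess(9e-08) == "FAILED"        # dab_filltree2 (System)
assert assess(1.0) == "FAILED"          # dab_monobit2
```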

Overall Assessment: A source is considered STRONG if it has no FAILED tests and fewer than 5% WEAK tests. This accounts for expected statistical variation while flagging genuine concerns.

Recommended Next Steps

• Run multiple independent batches (e.g., 5 runs) with newly generated data for both sources.
• Increase the sequence count per run (beyond the current 500) for stronger statistical confidence.
• Track stability: count how often any test hits WEAK or FAILED across runs.
• If FAILED tests appear consistently, investigate the specific test and entropy source.
• Archive all Dieharder reports and parameters for auditability.
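The stability-tracking step might look like the following sketch. The run data here is hypothetical; only the test names come from this report:

```python
from collections import Counter

# Hypothetical WEAK/FAILED flags from 5 independent runs (illustrative data):
runs = [
    {"dab_bytedistrib", "dab_monobit2", "diehard_craps"},
    {"dab_bytedistrib", "dab_monobit2"},
    {"dab_bytedistrib", "dab_monobit2", "marsaglia_tsang_gcd"},
    {"dab_bytedistrib", "dab_monobit2", "diehard_craps"},
    {"dab_bytedistrib", "dab_monobit2"},
]
flags = Counter(test for run in runs for test in run)

# Tests flagged in most runs point at the source, not statistical noise:
persistent = sorted(t for t, n in flags.items() if n >= 4)
print(persistent)  # -> ['dab_bytedistrib', 'dab_monobit2']
```

Tests that flag only once or twice across runs (like diehard_craps in this example) are candidates for statistical variation; tests that flag every time warrant investigation of the entropy source itself.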
— End of Report —