Formal Evaluation of Cryptographic Schemes
— Let the data speak.

Coverage × Correctness × Reproducibility — from high-level intent to implementation artifacts, across five model capabilities.

See latest results Why choose C.F.B
9
7
5
100%

Why C.F.B

Multi-backend validation

ProVerif, Tamarin, Scyther, Maude-NPA, AVISPA, CryptoVerif, EasyCrypt — cross-tool verification for robust conclusions.

Design → Implementation

From protocol goals/assumptions to implementation-level artifacts (e.g., Jasmin/constant-time checks), end-to-end coverage.

Five capabilities

Interpretation / Generation/ Completion / Transformation / Correction — not just “write”, but “verify”.

Comprehensive dataset

Curated end-to-end from design intent to implementation, covering protocols/schemes, security properties (secrecy, auth, integrity, freshness, PFS, IND-CPA/CCA, equivalence), adversary models (Dolev–Yao, state/ephemeral leaks, CK/eCK).

Overall Leaderboard (recomputed)

Capability Radar

Coverage

Schemes/Protocols × Security properties × Adversary models × Formal languages
SecrecyAuthenticationIntegrityFreshness Forward SecrecyIND-CPA/CCAEquivalenceDolev–Yao CK/eCKProVerifTamarinScytherAVISPAMaude-NPACryptoVerifEasyCrypt

Evaluation Pipeline

  1. 1 Generate
  2. 2 Verify
  3. 3 Aggregate
  4. 4 Viz
  5. 5 Render

Fully scriptable & CI-friendly; scoring formulas and weights are configurable.

Get Started

Use scripts/cfb.sh to reproduce results and figures locally in one go.

Explore Results See Pipeline

Visualization of the Evaluation Process (Real-world Cases)

① Dataset Content (inputdata)
file
② Prompt (prompt)
③ Model Input (modelinput)
④ Model Output (modeloutput)
⑤ Eval Result (evalresult)

Video: Automatic Validation for Generation Task of Seven Formal Languages (grok-3 LLM)

▶ Open video (MP4)

Can’t play inline? Open in a new tab or Download the video.