Skip to content

Conversation

@cristianoc
Copy link
Collaborator

@cristianoccristianoc commented Dec 13, 2025

Summary

Add support for parallel processing of CMT files during dead code and exception analysis using OCaml 5 Domains.

Features

  • -parallel N: Process files using N parallel domains (0 = sequential, default)
  • -parallel -1: Auto-detect number of CPU cores using Domain.recommended_domain_count()
  • -timing: Report internal timing breakdown by analysis phase
  • Benchmark infrastructure: New test project for performance measurement

How It Works

The parallelization targets the CMT file processing phase, which is the main bottleneck (typically 75-80% of total analysis time).

Parallel execution model:

  1. Files are divided into chunks, one per domain
  2. Each domain processes its chunk independently, collecting results in local data structures
  3. When a domain finishes its chunk, it briefly acquires a mutex and merges its local results into shared state
  4. After all domains complete (Domain.join), the merged results are passed to the analysis phase

The mutex is only held during the merge operation (list concatenation), not during file processing, which minimizes contention.

The analysis and reporting phases remain sequential as they require merged cross-file data.

Testing

Correctness Tests

# Run regular reanalyze tests (sequential) make test-reanalyze # Run parallel mode tests (verifies parallel produces same results) make test-reanalyze-parallel

Benchmark Tests

# Full benchmark: generates 50 copies of test files (~5000 files), compiles, and times analysis make benchmark-reanalyze # Or with custom number of copies make benchmark-reanalyze COPIES=100

Observed Results (50 copies, ~4300 files, 12 cores)

Timing Breakdown

PhaseSequentialParallel (-1 = auto)Speedup
CMT processing0.90s (78%)0.45s (65%)2.0x
Analysis0.24s (21%)0.24s (35%)1.0x
Reporting0.001s0.001s-
Total1.15s0.69s1.7x

Key Observations

  1. CMT processing is the bottleneck - accounts for ~78% of total time
  2. Parallelization provides ~2x speedup for CMT processing phase
  3. Overall speedup of ~1.7x with 12 cores
  4. Same results: Both modes report identical issue counts (19,000 issues)

Usage Examples

# Sequential (default) reanalyze -config # Parallel with 4 domains reanalyze -config -parallel 4 # Auto-detect cores reanalyze -config -parallel -1 # With timing output reanalyze -config -parallel -1 -timing

Sample Timing Output

Using 12 parallel domains for 4900 files Analysis reported 19000 issues (...) === Timing === CMT processing: 0.450s (65.5%) Analysis: 0.240s (34.3%) Reporting: 0.001s (0.1%) Total: 0.691s 

Remove global ModulePath mutable state used during per-file AST traversal by threading ModulePath.t explicitly through traversal code. This is preparation for parallel per-file processing and reduces shared global state during MAP. Signed-Off-By: Cristiano Calcagno <[email protected]>
Record type dependency edges immediately instead of buffering them in a global ref and flushing at end-of-file. Note: this reorders debug output in reanalyze logs: addTypeReference lines now appear next to extendTypeDependencies, rather than being emitted in a later end-of-file flush. Signed-Off-By: Cristiano Calcagno <[email protected]>
Stop using a global TypeLabels hashtable during AST processing. Instead, compute type-label dependencies in a post-merge pass from merged Declarations and add the corresponding type-reference edges before solving. Fix: use raw decl positions (not declGetLoc/posAdjustment) when building cross-file label indices, since reference graph keys are raw positions. Note: debug output in deadcode expected logs is reordered due to moving label linking from per-file traversal to the post-merge pass. Signed-Off-By: Cristiano Calcagno <[email protected]>
Stop using global mutable exception declaration state during AST processing. Instead, derive a pure exception lookup from merged Declarations and use it when processing CrossFileItems exception refs. Signed-Off-By: Cristiano Calcagno <[email protected]>
- Replace global Values.valueBindingsTable with per-file values_builder - Replace global Checks.checks with per-file checks_builder - processCmt now returns file_result with per-file data - Add runChecks to process all checks after merging values tables - Update Reanalyze.ml to collect exception results and run checks at end - This enables future parallelization of AST processing
Add support for parallel processing of CMT files during dead code and exception analysis using OCaml 5 Domains. Features: - New -parallel N flag: process files using N domains (0 = sequential) - -parallel -1: auto-detect number of CPU cores - New -timing flag: report internal timing breakdown by phase - New benchmark infrastructure for performance testing The parallelization targets the CMT file processing phase, which is the main bottleneck (typically 75-80% of analysis time). The analysis and reporting phases remain sequential as they require merged cross-file data. Signed-Off-By: Cursor AI <[email protected]>
@cristianoccristianoc requested a review from zthDecember 13, 2025 05:51
Copy link

@chatgpt-codex-connectorchatgpt-codex-connectorbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Break down timing into sub-phases: - CMT processing: - File loading: time spent reading and traversing CMT files - Result collection: time spent merging results from domains - Analysis: - Merging: combining per-file data into global structures - Solving: running the liveness solver Signed-Off-By: Cursor AI <[email protected]>
@cristianoc
Copy link
CollaboratorAuthor

Updated Timing Breakdown (50 copies, ~4900 files, 12 cores)

Added granular timing that separates sub-phases:

Sequential Mode

=== Timing === CMT processing: 0.779s (77.3%) - File loading: 0.779s - Result collection: 0.000s Analysis: 0.227s (22.6%) - Merging: 0.081s - Solving: 0.146s Reporting: 0.001s (0.1%) Total: 1.007s 

Parallel Mode (auto-detect = 12 cores)

=== Timing === CMT processing: 0.422s (63.5%) - File loading: 0.422s - Result collection: 0.000s Analysis: 0.242s (36.4%) - Merging: 0.094s - Solving: 0.148s Reporting: 0.001s (0.1%) Total: 0.664s 

Key Observations

Sub-phaseSequentialParallelSpeedup
File loading0.779s0.422s1.85x
Result collection0.000s0.000s-
Merging0.081s0.094s0.86x
Solving0.146s0.148s0.99x
Total1.007s0.664s1.52x
  • Result collection is negligible - mutex contention is minimal
  • File loading is the parallelization target and shows 1.85x speedup
  • Analysis phases (merging, solving) are sequential and take similar time in both modes
  • The slight slowdown in merging for parallel mode is likely due to different list ordering

@cristianoc
Copy link
CollaboratorAuthor

cristianoc commented Dec 13, 2025

@cknitt to test, and eventually enable, parallelism, we need to build with ocaml 5.
CI currently also builds with older.
Thoughts?

@cknitt
Copy link
Member

@cknitt to test, and eventually enable, parallelism, we need to build with ocaml 5. CI currently also builds with older. Thoughts?

I kept the older versions in CI to make sure that we don't break compatibility inadvertently, so that we can support a broad version range when we eventually publish the compiler libs on OPAM.

But enabling parallelism is a good reason to move to OCaml 5 only, so this is fine with me.

@cristianoc
Copy link
CollaboratorAuthor

@cknitt to test, and eventually enable, parallelism, we need to build with ocaml 5. CI currently also builds with older. Thoughts?

I kept the older versions in CI to make sure that we don't break compatibility inadvertently, so that we can support a broad version range when we eventually publish the compiler libs on OPAM.

But enabling parallelism is a good reason to move to OCaml 5 only, so this is fine with me.

What's the earliest version we should keep?

@cknitt
Copy link
Member

What's the earliest version we should keep?

I would say the earliest we can based on what features/lib you require for parallelism.

- Update CI compatibility test to use OCaml 5.0.1 - Update all package dependencies in dune-project to require OCaml >= 5.0.1 - Update rescript.opam.template to require OCaml >= 5.0.1 - Remove redundant OCaml version check from syntax_benchmarks/dune
- Add 'set -e' to exit immediately on errors - Use 'opam exec --' to match checkformat.sh behavior - Now make format will fail visibly instead of silently continuing when formatting fails
@cristianoccristianocforce-pushed the reanalyze-reduce-global-state-ast branch from a55cdab to 452a4b1CompareDecember 15, 2025 04:16
@pkg-pr-new
Copy link

pkg-pr-newbot commented Dec 15, 2025

Open in StackBlitz

rescript

npm i https://pkg.pr.new/rescript-lang/rescript@8089 

@rescript/darwin-arm64

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/darwin-arm64@8089 

@rescript/darwin-x64

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/darwin-x64@8089 

@rescript/linux-arm64

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/linux-arm64@8089 

@rescript/linux-x64

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/linux-x64@8089 

@rescript/runtime

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/runtime@8089 

@rescript/win32-x64

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/win32-x64@8089 

commit: 8632e45

@cristianoc
Copy link
CollaboratorAuthor

What's the earliest version we should keep?

I would say the earliest we can based on what features/lib you require for parallelism.

OK it's 5.0.0.
PR ready to do go now.

zth
zth approved these changes Dec 15, 2025
@cristianoccristianocforce-pushed the reanalyze-reduce-global-state-ast branch from 1c05378 to 8632e45CompareDecember 15, 2025 13:13
@cristianoccristianoc enabled auto-merge (rebase) December 15, 2025 13:14
@cristianoccristianoc merged commit 699d02b into masterDec 15, 2025
25 checks passed
@cristianoccristianoc deleted the reanalyze-reduce-global-state-ast branch December 15, 2025 14:52
Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

@cristianoc@cknitt@zth