Level 2: CodeQL & entrypoints

Level 1 gives you a symbol table and a call graph resolved by the TypeScript checker, with RTA and phantom nodes. Level 2 is the enrichment layer on top of it: CodeQL-derived edges for the dynamic cases the checker can’t reach, and framework entrypoint detection. This page describes the design and states which parts are implemented today.

The TypeScript checker resolves static call structure precisely, but some edges are invisible to it — dynamic dispatch through values, reflection-like patterns, and dataflow that crosses call boundaries indirectly. CodeQL sees those. The design:

  1. Build a CodeQL database for the project (codeql database create --language=javascript-typescript), cached under the analysis cache directory.
  2. Run a call-graph query that emits caller/callee locations.
  3. Map each result row back to a signature via the same signatureOf canonicalizer the rest of the analyzer uses, so CodeQL endpoints land in the same identity space as level-1 edges.
  4. Merge the CodeQL edges into the level-1 graph, keyed by (source, target) — summing weights and unioning provenance, so an edge both engines saw carries ["codeql", "tsc"].

Only edges whose endpoints exist in the symbol table would be emitted, preserving the level-1 no-dangling invariant.
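Steps 3 and 4 plus the no-dangling rule can be sketched roughly as follows. This is an illustration under assumptions, not the analyzer's code: the `LocationRow` and `CandidateEdge` shapes, the `rowsToEdges` helper, and the `SignatureLookup` stand-in for `signatureOf` are all hypothetical names, and the real canonicalizer's inputs may differ.

```typescript
// Hypothetical row shape for a call-graph query result (assumption).
interface LocationRow {
  callerFile: string;
  callerLine: number;
  calleeFile: string;
  calleeLine: number;
}

// Hypothetical edge shape; only the "codeql" provenance token comes from the docs.
interface CandidateEdge {
  source: string;
  target: string;
  weight: number;
  provenance: string[];
}

// Stand-in for the analyzer's signatureOf canonicalizer (assumption):
// resolves a source location to a signature, or undefined if it cannot.
type SignatureLookup = (file: string, line: number) => string | undefined;

function rowsToEdges(
  rows: LocationRow[],
  lookup: SignatureLookup,
  symbolTable: ReadonlySet<string>,
): CandidateEdge[] {
  const edges: CandidateEdge[] = [];
  for (const row of rows) {
    const source = lookup(row.callerFile, row.callerLine);
    const target = lookup(row.calleeFile, row.calleeLine);
    // Skip rows that cannot be mapped into the shared identity space.
    if (source === undefined || target === undefined) continue;
    // Preserve the no-dangling invariant: both endpoints must be known symbols.
    if (!symbolTable.has(source) || !symbolTable.has(target)) continue;
    edges.push({ source, target, weight: 1, provenance: ["codeql"] });
  }
  return edges;
}
```

Filtering before the merge keeps the invariant local to the CodeQL path, so the merge step never has to reason about unknown endpoints.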

flowchart LR
    L1["level-1 graph<br/>provenance: tsc / import"] --> M[merge by source,target]
    DB["CodeQL database"] --> Q["call-graph query"]
    Q --> MAP["map rows → signatures<br/>via signatureOf"]
    MAP --> CE["CodeQL edges<br/>provenance: codeql"]
    CE --> M
    M --> CG[enriched call_graph]

The merge step is real and exercised today — it just receives an empty CodeQL edge list. It deduplicates by (source, target): weights sum, provenance lists union (and sort), and tags merge. So once the query lands, an edge confirmed by both the checker and CodeQL will surface with both provenance tokens, and consumers can weigh edges by how many engines agreed.
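The merge semantics described above can be sketched in a few lines. This is a minimal illustration, not the analyzer's implementation: the `GraphEdge` shape and the `mergeEdges` name are hypothetical, and treating tags as a string map where later values win on key collisions is an assumption.

```typescript
// Hypothetical edge shape (assumption); tags modeled as a flat string map.
interface GraphEdge {
  source: string;
  target: string;
  weight: number;
  provenance: string[];
  tags: Record<string, string>;
}

// Return a sorted copy with duplicates removed.
function sortedUnion(a: string[], b: string[]): string[] {
  return Array.from(new Set(a.concat(b))).sort();
}

function mergeEdges(level1: GraphEdge[], codeql: GraphEdge[]): GraphEdge[] {
  const merged = new Map<string, GraphEdge>();
  for (const edge of level1.concat(codeql)) {
    // JSON-encode the pair so the key is unambiguous for any signature text.
    const key = JSON.stringify([edge.source, edge.target]);
    const existing = merged.get(key);
    if (existing === undefined) {
      // Copy so the inputs are never mutated.
      merged.set(key, {
        source: edge.source,
        target: edge.target,
        weight: edge.weight,
        provenance: sortedUnion(edge.provenance, []),
        tags: { ...edge.tags },
      });
    } else {
      existing.weight += edge.weight; // weights sum
      existing.provenance = sortedUnion(existing.provenance, edge.provenance); // union, sorted
      existing.tags = { ...existing.tags, ...edge.tags }; // tags merge (later wins: assumption)
    }
  }
  return Array.from(merged.values());
}
```

With an empty CodeQL list, as today, the function returns the level-1 edges with their provenance deduplicated and sorted. Once both engines report the same `(source, target)` pair, the merged edge carries `["codeql", "tsc"]`.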

An entrypoint is a function the framework calls that your own code never calls directly — an HTTP route handler, a message consumer, a CLI command. Static call-graph analysis can’t see these edges (the framework wires them at runtime), so without help those handlers look like dead code and “where does execution enter?” is unanswerable.

The schema already carries the result type, TSEntrypoint:

| Field | Meaning |
| --- | --- |
| signature | The TSCallable.signature this entrypoint refers to. |
| framework | The framework that dispatches it (e.g. "nestjs", "express"). |
| detection_source | How it was found: decorator, base_class, convention, extension, … (open vocabulary). |
| route_path, http_methods | For HTTP routes. |
| source_file | The file declaring the binding. |
| tags | Free-form, namespaced metadata for extensions. |

The symbol table is already shaped to make detection tractable: TSCallable carries is_entrypoint and entrypoint_framework flags, decorators are captured with resolved qualified names and raw argument fragments (so a @Get('/users') route path is recoverable), and parameter decorators (@Param('id')) are recorded. Detection itself — the finders that populate entrypoints[framework] — is the level-2 work that remains.
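To make the remaining work concrete, here is a rough sketch of what a decorator-based finder might look like. It is not the planned implementation: `DecoratorInfo`, `CallableInfo`, `EntrypointSketch`, and `findNestRoutes` are hypothetical names, the `"@nestjs/common.Get"` qualified-name format is an assumption about how resolved names are spelled, and controller-level path prefixes are ignored.

```typescript
// Hypothetical, simplified views of the symbol table (assumptions).
interface DecoratorInfo {
  qualifiedName: string;  // resolved name, format assumed: "<module>.<export>"
  rawArguments: string[]; // raw source fragments, e.g. ["'/users'"]
}

interface CallableInfo {
  signature: string;
  sourceFile: string;
  decorators: DecoratorInfo[];
}

// Mirrors the TSEntrypoint fields described above; not the real schema type.
interface EntrypointSketch {
  signature: string;
  framework: string;
  detection_source: string;
  route_path?: string;
  http_methods?: string[];
  source_file: string;
}

// NestJS HTTP method decorators exported from @nestjs/common.
const NEST_HTTP_DECORATORS = new Map<string, string>([
  ["@nestjs/common.Get", "GET"],
  ["@nestjs/common.Post", "POST"],
  ["@nestjs/common.Put", "PUT"],
  ["@nestjs/common.Patch", "PATCH"],
  ["@nestjs/common.Delete", "DELETE"],
]);

// Recover a string literal's value from its raw fragment; undefined if the
// fragment is not a simple quoted literal.
function unquote(fragment: string): string | undefined {
  const match = /^(['"`])(.*)\1$/.exec(fragment.trim());
  return match ? match[2] : undefined;
}

function findNestRoutes(callables: CallableInfo[]): EntrypointSketch[] {
  const found: EntrypointSketch[] = [];
  for (const callable of callables) {
    for (const decorator of callable.decorators) {
      const method = NEST_HTTP_DECORATORS.get(decorator.qualifiedName);
      if (method === undefined) continue;
      const first = decorator.rawArguments[0];
      found.push({
        signature: callable.signature,
        framework: "nestjs",
        detection_source: "decorator",
        // Left undefined when the decorator has no literal path argument.
        route_path: first === undefined ? undefined : unquote(first),
        http_methods: [method],
        source_file: callable.sourceFile,
      });
    }
  }
  return found;
}
```

Matching on resolved qualified names rather than bare decorator names avoids false positives from an unrelated `Get` decorator imported from another package.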

You can pass --analysis-level 2 today. The pipeline accepts it and the artifact shape is final, but expect level-1 results:

codeanalyzer-typescript --input ./my-ts-project --output ./out --analysis-level 2
# warns: CodeQL enrichment not implemented; emits no extra edges

That makes --analysis-level 2 safe to wire into a pipeline now: when enrichment lands, the same command starts returning richer graphs without any schema change on your side.