How I put the Italian Constitution and codes into a knowledge graph

6/17/2026 · 4 min read

Direct answer: for Open·Parlamento I ingested the Italian Constitution and seven codes (among them the criminal, civil, civil procedure, road, consumer and private insurance codes) into a knowledge graph, article by article. That is about 6,445 articles, each with a stable ELI identifier for citations, linked by 86,672 authoritative amendment relations extracted from Normattiva in Akoma Ntoso format. No paraphrase: the text is verbatim and the relations are deterministic (confidence 1.0). Here’s how, with the numbers and the honest limits.

TL;DR

  • What: Constitution (139 articles) + 7 codes = ~6,445 articles, full text per article.
  • How it’s cited: each article has an ELI ID, e.g. eli:/it/codice-penale/art/575, so a citation resolves to the exact article.
  • Relations: 86,672 edges (amends, repeals, replaces, inserts, converts, extends) from Normattiva (Akoma Ntoso), confidence 1.0, created_by: normattiva.
  • Honesty: many referenced statutes aren’t in the corpus yet — the graph’s closure over relations is still partial.

Why a graph and not a PDF

A code is already text. Why turn it into a graph? Because the useful questions aren’t “what does article 575 say” but “what changed, who changed it, which statute amends it”. Those answers live in the relations between statutes, not in isolated text. A knowledge graph makes those relations first-class: nodes (the articles) and typed edges (the amendments), navigable and citable.

And above all: an AI agent can answer reliably only if every statement resolves to a real article, not to an invented paraphrase. That’s why the text goes in verbatim.
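To make the “nodes and typed edges” idea concrete, here is a minimal sketch of how such a graph could answer “which statute amends this article”. The `Edge` and `StatuteGraph` names and the `example-law-*` identifiers are illustrative assumptions, not the actual Open·Parlamento data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str    # identifier of the statute that makes the change
    relation: str  # typed relation, e.g. "amends", "repeals", "replaces"
    target: str    # identifier of the article being changed


class StatuteGraph:
    """Toy in-memory graph indexed by the article an edge points at."""

    def __init__(self):
        self._incoming = defaultdict(list)

    def add_edge(self, edge):
        self._incoming[edge.target].append(edge)

    def changes_to(self, target, relation=None):
        """Return edges pointing at `target`, optionally filtered by type."""
        edges = self._incoming.get(target, [])
        return [e for e in edges if relation is None or e.relation == relation]


# Hypothetical source identifiers, used only for illustration.
ARTICLE = "eli:/it/codice-penale/art/575"
graph = StatuteGraph()
graph.add_edge(Edge("eli:/it/example-law-a", "amends", ARTICLE))
graph.add_edge(Edge("eli:/it/example-law-b", "replaces", ARTICLE))

amending = [e.source for e in graph.changes_to(ARTICLE, relation="amends")]
```

Indexing by target keeps the “who changed this article” lookup to a single dictionary access, which is the question a flat PDF cannot answer.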

What’s inside (the numbers)

| Source | Articles | Notes |
|---|---|---|
| Constitution | 139 | full text |
| Criminal Code | 734 | |
| Civil Code | ~3,000 | the largest |
| Civil Procedure Code | 831 | |
| Road Code | 240 | |
| Consumer Code | 146 | |
| Private Insurance Code | 355 | |
| Total | ~6,445 | per article, with ELI |

The method: per-article, with ELI

Each article becomes a standalone document in the graph, with a natural-prose header embedding the code, the article number and the ELI identifier for citation recovery. In practice:

Criminal Code, Article 575 (ELI identifier: eli:/it/codice-penale/art/575).

[verbatim text of the article]

The ELI (European Legislation Identifier) is the key: a stable, standard ID that lets you cite the exact article and link it to other statutes unambiguously. A question about homicide resolves, end-to-end, to “Article 575 of the Criminal Code”, with the reference rather than a generic sentence.
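The header format can be sketched in a few lines. This is a minimal illustration that assumes every ELI follows the `eli:/it/<code>/art/<number>` shape of the example above; the function names are hypothetical and the real pipeline may differ.

```python
import re

# Matches the parenthetical "(ELI identifier: eli:/...)" used in the header line.
ELI_PATTERN = re.compile(r"\(ELI identifier: (eli:/[^\s)]+)\)")


def build_eli(code_slug, article):
    """Assumed ELI shape, generalized from the single example in the text."""
    return f"eli:/it/{code_slug}/art/{article}"


def build_document(code_name, code_slug, article, text):
    """Prefix the verbatim article text with a natural-prose header."""
    eli = build_eli(code_slug, article)
    header = f"{code_name}, Article {article} (ELI identifier: {eli})."
    return f"{header}\n\n{text}"


def recover_eli(document):
    """Recover the ELI from the header line, or None if it is missing."""
    first_line = document.split("\n", 1)[0]
    match = ELI_PATTERN.search(first_line)
    return match.group(1) if match else None


doc = build_document(
    "Criminal Code", "codice-penale", "575", "[verbatim text of the article]"
)
```

Because the identifier sits in the text itself, a citation can be recovered from a retrieved document without a separate metadata lookup.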

The authoritative relations: 86,672 edges, zero invention

The heart of the graph isn’t guessed by an LLM. The relations between statutes are extracted deterministically from Normattiva, which publishes texts in Akoma Ntoso with <activeModifications> blocks (what this statute modifies) and <passiveModifications> blocks (which statutes modify this one). From there we derive typed edges:

  • repeals: a provision is repealed (the absolute majority of edges)
  • amends: generic amendment
  • inserts: new provisions are inserted
  • replaces: text is replaced
  • converts: a decree is converted into law
  • extends: a deadline is extended

Each edge carries its evidence text, created_by: normattiva and confidence 1.0. These are facts, not inferences. On top of this base you can add AI-inferred edges (impacts on rights and sectors), but always labeled as such and with visible confidence: honesty about provenance comes first.
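As a rough sketch of what deterministic extraction can look like, the snippet below reads the two modification blocks from a simplified Akoma Ntoso fragment with the standard library. The sample XML, the type-to-relation mapping, the choice of evidence text and the `example-*` identifiers are all assumptions made for illustration; Normattiva’s real files are richer and the actual pipeline may map them differently.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

AKN_NS = {"akn": "http://docs.oasis-open.org/legaldocml/ns/akn/3.0"}

# Assumed mapping from textualMod types to edge labels (illustrative only).
RELATION_BY_TYPE = {"repeal": "repeals", "substitution": "replaces", "insertion": "inserts"}


@dataclass(frozen=True)
class Edge:
    source: str
    relation: str
    target: str
    evidence: str
    created_by: str = "normattiva"  # provenance label from the article
    confidence: float = 1.0         # deterministic extraction, not inference


# Simplified, hypothetical fragment; not a real Normattiva document.
SAMPLE_XML = """<akomaNtoso xmlns="http://docs.oasis-open.org/legaldocml/ns/akn/3.0">
  <act><meta><analysis source="#example">
    <activeModifications>
      <textualMod type="repeal">
        <source href="#art_1"/>
        <destination href="eli:/it/example-code/art/10"/>
        <old>Text of the repealed provision.</old>
      </textualMod>
    </activeModifications>
    <passiveModifications>
      <textualMod type="substitution">
        <source href="eli:/it/example-law-b"/>
        <destination href="#art_2"/>
        <new>Replacement wording.</new>
      </textualMod>
    </passiveModifications>
  </analysis></meta></act>
</akomaNtoso>"""


def extract_edges(xml_text, this_eli):
    """Turn active/passive modification entries into typed edges."""
    root = ET.fromstring(xml_text)
    edges = []
    for block, outgoing in (("activeModifications", True), ("passiveModifications", False)):
        for mod in root.iterfind(f".//akn:{block}/akn:textualMod", AKN_NS):
            relation = RELATION_BY_TYPE.get(mod.get("type"))
            if relation is None:
                continue  # unknown type: skip rather than guess
            # Active entries name the statute we modify; passive ones name the modifier.
            other = mod.find("akn:destination" if outgoing else "akn:source", AKN_NS)
            if other is None or not other.get("href"):
                continue
            evidence = (mod.findtext("akn:new", namespaces=AKN_NS)
                        or mod.findtext("akn:old", namespaces=AKN_NS) or "").strip()
            href = other.get("href")
            source, target = (this_eli, href) if outgoing else (href, this_eli)
            edges.append(Edge(source, relation, target, evidence))
    return edges


edges = extract_edges(SAMPLE_XML, "eli:/it/example-law-a")
```

Nothing in this path calls a model: an edge exists only if the metadata states it, which is what justifies a fixed confidence of 1.0.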

The limits, declared

An honest graph declares what it does not cover. Today:

  • Relations often point to statutes that are referenced but not yet present in the corpus: the graph’s closure over edges is still partial. We’re expanding the corpus to reduce the “dangling” nodes.
  • The corpus covers the Constitution + main codes; not all legislation in force.
  • AI-inferred edges are complementary and declared: the authoritative base remains Normattiva’s.
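The first limit can be measured rather than just stated. A minimal sketch, assuming edges are plain (source, relation, target) tuples and the corpus is a set of identifiers; the function name and sample identifiers are hypothetical:

```python
def closure_report(edges, corpus_ids):
    """Report which referenced nodes are missing and how many edges fully resolve."""
    corpus = set(corpus_ids)
    # Every identifier that appears at either end of an edge.
    referenced = {node for source, _relation, target in edges for node in (source, target)}
    dangling = sorted(referenced - corpus)
    # An edge resolves only if both endpoints are in the corpus.
    resolved = sum(
        1 for source, _relation, target in edges
        if source in corpus and target in corpus
    )
    coverage = resolved / len(edges) if edges else 1.0
    return {"dangling": dangling, "resolved_edges": resolved, "coverage": coverage}


# Toy data: "law-b" is referenced by an edge but not ingested.
sample_edges = [
    ("law-a", "amends", "art-1"),
    ("law-b", "repeals", "art-2"),
    ("law-a", "inserts", "art-2"),
]
report = closure_report(sample_edges, {"law-a", "art-1", "art-2"})
```

Tracking the dangling list over time gives a concrete target for which statutes to ingest next.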

Declaring the limits isn’t a weakness: it’s the difference between a tool that cites the source and one that pretends to know everything.

FAQ

What is an ELI identifier?

ELI (European Legislation Identifier) is a European standard for giving each statute — and each article — a stable, resolvable identifier. It lets you cite the exact article and link statutes to one another unambiguously, even across jurisdictions.

Are the amendment relations generated by AI?

No. The authoritative relations (amends, repeals, replaces…) are extracted deterministically from Normattiva’s Akoma Ntoso metadata, with confidence 1.0. Any AI-inferred relations are kept separate, labeled, and given lower confidence.

How many articles are in the graph?

About 6,445 articles: the Constitution (139) plus seven codes (among them the criminal, civil, civil procedure, road, consumer and private insurance codes), all ingested per-article with an ELI identifier.


We build knowledge graphs that cite the source rather than replace it. See how Open·Parlamento works and explore its open data, or let’s talk.

Tags: Knowledge Graph, Legal Tech, AI, Open Data

Written by Giulio Garofalo