How I put the Italian Constitution and codes into a knowledge graph

Direct answer: for Open·Parlamento I ingested the Italian Constitution and seven codes (criminal, civil, civil procedure, road, consumer, private insurance) into a knowledge graph, article by article — about 6,445 articles — each with a stable ELI identifier for citations, linked by 86,672 authoritative amendment relations extracted from Normattiva in Akoma Ntoso format. No paraphrase: the text is verbatim, the relations are deterministic (confidence 1.0). Here’s how, with the numbers and the honest limits.
TL;DR
- What: Constitution (139 articles) + 7 codes = ~6,445 articles, full text per article.
- How it’s cited: each article has an ELI ID, e.g. eli:/it/codice-penale/art/575, so a citation resolves to the exact article.
- Relations: 86,672 edges (amends, repeals, replaces, inserts, converts, extends) from Normattiva (Akoma Ntoso), confidence 1.0, created_by: normattiva.
- Honesty: many referenced statutes aren’t in the corpus yet, so the graph’s closure over relations is still partial.
Why a graph and not a PDF
A code is already text. Why turn it into a graph? Because the useful questions aren’t “what does article 575 say” but “what changed, who changed it, which statute amends it”. Those answers live in the relations between statutes, not in isolated text. A knowledge graph makes those relations first-class: nodes (the articles) and typed edges (the amendments), navigable and citable.
And above all: an AI agent can answer reliably only if every statement resolves to a real article, not to an invented paraphrase. That’s why the text goes in verbatim.
What’s inside (the numbers)
| Source | Articles | Notes |
|---|---|---|
| Constitution | 139 | full text |
| Criminal Code | 734 | |
| Civil Code | ~3,000 | the largest |
| Civil Procedure Code | 831 | |
| Road Code | 240 | |
| Consumer Code | 146 | |
| Private Insurance Code | 355 | |
| Total | ~6,445 | per article, with ELI |
The method: per-article, with ELI
Each article becomes a standalone document in the graph, with a natural-prose header embedding the code, the article number and the ELI identifier for citation recovery. In practice:
Criminal Code, Article 575 (ELI identifier: eli:/it/codice-penale/art/575).
[verbatim text of the article]
The ELI (European Legislation Identifier) is the key: a stable, standard ID that lets you cite the exact article and link it to other statutes unambiguously. A question about homicide resolves, end-to-end, to “Article 575 of the Criminal Code”, with the reference rather than a generic sentence.
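A minimal sketch of the per-article document, assuming the ELI pattern seen in the example above. The Article class, its field names and the find_elis helper are hypothetical names for illustration, not the actual Open·Parlamento code, and the regex is inferred from that single example.

```python
import re
from dataclasses import dataclass

# Pattern inferred from the one example in this article; other ELI shapes may exist.
ELI_RE = re.compile(r"eli:/it/[a-z0-9-]+/art/[0-9A-Za-z-]+")


@dataclass(frozen=True)
class Article:
    code_slug: str  # slug as it appears in the ELI, e.g. "codice-penale"
    code_name: str  # human-readable code name used in the header
    number: str     # kept as a string so non-numeric article labels also fit
    text: str       # verbatim article text, never paraphrased

    @property
    def eli(self) -> str:
        return f"eli:/it/{self.code_slug}/art/{self.number}"

    def to_document(self) -> str:
        # Natural-prose header first, then the untouched article text.
        header = f"{self.code_name}, Article {self.number} (ELI identifier: {self.eli})."
        return f"{header}\n{self.text}"


def find_elis(text: str) -> list[str]:
    """Recover every ELI identifier mentioned in a piece of text."""
    return ELI_RE.findall(text)


art = Article("codice-penale", "Criminal Code", "575", "[verbatim text of the article]")
doc = art.to_document()
print(doc)
print(find_elis(doc))
```

Keeping the identifier inside the prose header means any chunk retrieved from the graph still carries its own citation, which a simple pattern match can recover.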
The authoritative relations: 86,672 edges, zero invention
The heart of the graph isn’t guessed by an LLM. The relations between statutes are extracted deterministically from Normattiva, which publishes texts in Akoma Ntoso with <activeModifications> blocks (what this statute modifies) and <passiveModifications> blocks (what modifies this statute). From there we derive typed edges:
- repeals: repeal (the absolute majority of edges)
- amends: generic amendment
- inserts: insertion of new provisions
- replaces: text replacement
- converts: decree→law conversion
- extends: deadline extension
Each edge carries its evidence text, created_by: normattiva and confidence 1.0. These are facts, not inferences. On top of this base you can add AI-inferred edges (impacts on rights and sectors), but always labeled as such and with visible confidence: honesty about provenance first.
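The extraction step could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual pipeline: the XML fragment is a simplified, hypothetical sample (real Normattiva files are namespaced and richer), the statute identifiers are invented, and the EDGE_TYPE mapping from modification type to edge label is an assumption.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical fragment; identifiers are made up for the example.
SAMPLE = """
<akomaNtoso>
  <meta>
    <analysis>
      <activeModifications>
        <textualMod type="repeal">
          <source href="#art_1"/>
          <destination href="/example/statute-a/art/10"/>
        </textualMod>
        <textualMod type="substitution">
          <source href="#art_2"/>
          <destination href="/example/statute-b/art/3"/>
        </textualMod>
      </activeModifications>
    </analysis>
  </meta>
</akomaNtoso>
"""

# Assumed mapping from modification type to the edge labels used in this article.
EDGE_TYPE = {"repeal": "repeals", "substitution": "replaces", "insertion": "inserts"}


def local(tag: str) -> str:
    """Drop an XML namespace prefix such as '{uri}name', if present."""
    return tag.rsplit("}", 1)[-1]


def extract_edges(xml_text: str, statute_id: str) -> list[dict]:
    """Turn each active modification into one typed, fully attributed edge."""
    root = ET.fromstring(xml_text.strip())
    edges = []
    for block in root.iter():
        if local(block.tag) != "activeModifications":
            continue
        for mod in block:
            dest = next((c for c in mod if local(c.tag) == "destination"), None)
            if dest is None:
                continue  # nothing to link to, so no edge is emitted
            edges.append({
                "source": statute_id,
                "target": dest.get("href"),
                # Unknown types fall back to the generic "amends" label.
                "type": EDGE_TYPE.get(mod.get("type"), "amends"),
                "evidence": ET.tostring(mod, encoding="unicode").strip(),
                "created_by": "normattiva",
                "confidence": 1.0,
            })
    return edges


edges = extract_edges(SAMPLE, "/example/statute-x")
for edge in edges:
    print(edge["source"], edge["type"], edge["target"])
```

Nothing here is inferred: an edge exists only if the source XML declares the modification, and the declaring fragment travels with it as evidence.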
The limits, declared
An honest graph declares what it does not cover. Today:
- Relations often point to statutes referenced but not yet present in the corpus: the graph’s closure over edges is still partial. We’re expanding the corpus to reduce the “dangling” nodes.
- The corpus covers the Constitution + main codes; not all legislation in force.
- AI-inferred edges are complementary and declared: the authoritative base remains Normattiva’s.
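One way to put a number on the first limit is to compare edge targets against the set of identifiers actually in the corpus. The sketch below uses toy data; the closure_report function, the edge dictionaries and the identifiers are hypothetical, not the project's real schema.

```python
def closure_report(edges: list[dict], corpus_ids: set[str]) -> dict:
    """List targets missing from the corpus and the share of edges that resolve."""
    targets = {e["target"] for e in edges}
    dangling = sorted(targets - corpus_ids)
    resolved = sum(1 for e in edges if e["target"] in corpus_ids)
    coverage = resolved / len(edges) if edges else 1.0
    return {"dangling": dangling, "edge_coverage": coverage}


# Toy data: one target is in the corpus, the other is only referenced.
corpus_ids = {"eli:/it/codice-penale/art/575"}
toy_edges = [
    {"source": "/example/statute-x", "target": "eli:/it/codice-penale/art/575"},
    {"source": "/example/statute-x", "target": "/example/statute-not-ingested/art/1"},
]

report = closure_report(toy_edges, corpus_ids)
print(report)
```

Tracking this share over time shows whether expanding the corpus is actually closing the graph.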
Declaring the limits isn’t a weakness: it’s the difference between a tool that cites the source and one that pretends to know everything.
FAQ
What is an ELI identifier?
ELI (European Legislation Identifier) is a European standard for giving each statute — and each article — a stable, resolvable identifier. It lets you cite the exact article and link statutes to one another unambiguously, even across jurisdictions.
Are the amendment relations generated by AI?
No. The authoritative relations (amends, repeals, replaces…) are extracted deterministically from Normattiva’s Akoma Ntoso metadata, with confidence 1.0. Any AI-inferred relations are kept separate, labeled, and given lower confidence.
How many articles are in the graph?
About 6,445 articles: the Constitution (139) plus seven codes (criminal, civil, civil procedure, road, consumer, private insurance), all ingested per-article with an ELI identifier.