I downloaded EU law and added it to the Italian-law graph

6/17/2026 · 4 min read

Direct answer: Italian law doesn’t live alone — it transposes directives, implements regulations, refers to European rules. For Open·Parlamento I took the EU Publications Office RDF bulk dump (56,292 acts available), parsed it, and integrated ~2,312 European statutes into the knowledge graph — regulations, directives, decisions — each with CELEX and ELI identifiers. Full-text ingestion via an EUR-Lex/CELLAR connector is the next step. Here’s the method and the real numbers, without inflating them.

TL;DR

  • Source: EU Publications Office RDF bulk dump — 56,292 acts available.
  • Integrated in the graph: ~2,312 acts as nodes (43% regulations, plus decisions, directives…), time span 1964–2025.
  • Identity: each act has CELEX + ELI, to link it to the Italian statutes that implement or transpose it.
  • Honesty: most acts carry metadata and title only; the full text requires the EUR-Lex/CELLAR connector, which is in progress.

Why EU law belongs in the graph

A large slice of Italian law is, in fact, European law in disguise: a legislative decree that transposes a directive, a national rule that implements a regulation. If the graph stops at Italy’s borders, it loses half the story. Adding EU law lets you answer questions like «which directive does this decree transpose» or «what’s the European legal basis of this rule» — exactly the links that today take hours of manual work.

The source: the EU Publications Office bulk dump

The EU publishes its acts as an RDF dump (CDM, Common Data Model) through the Publications Office. It’s the right path: no scraping, a structured and standard dump. The dump I processed contained 56,292 legislative acts.

From there I extracted, for each act, the key metadata: CELEX identifier, title (in Italian where available, with English fallback), date, type (regulation/directive/decision) and issuing authority.
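The title fallback described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the `EuActRecord` class, its field names, and the `build_record` helper are hypothetical, and it assumes the titles have already been pulled out of the RDF into a language-to-title mapping.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class EuActRecord:
    """Hypothetical container for the metadata fields listed above."""
    celex: str
    title: Optional[str]
    title_lang: Optional[str]
    date: Optional[str]       # ISO date string as found in the dump
    act_type: Optional[str]   # e.g. "regulation", "directive", "decision"
    author: Optional[str]     # issuing authority


def pick_title(
    titles: Mapping[str, str], preferred: Tuple[str, ...] = ("it", "en")
) -> Tuple[Optional[str], Optional[str]]:
    """Return (title, lang) for the first preferred language with a non-empty title."""
    for lang in preferred:
        title = (titles.get(lang) or "").strip()
        if title:
            return title, lang
    return None, None


def build_record(celex, titles, date=None, act_type=None, author=None) -> EuActRecord:
    """Assemble one record, applying the Italian-then-English title fallback."""
    title, lang = pick_title(titles)
    return EuActRecord(celex, title, lang, date, act_type, author)
```

Keeping the language of the chosen title on the record makes it easy to see later how many nodes fell back to English.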

The real numbers (no overclaim)

Item                            Value
Acts available in the dump      56,292
Acts integrated in the graph    ~2,312
Time span                       1964–2025
Regulations                     ~43%
Decisions                       ~23%
Directives                      ~6%
Other (opinions, etc.)          ~28%

I want to be honest, because it’s the project’s principle: I have not yet ingested the full text of all EU law. I integrated thousands of acts as graph nodes (metadata + title), and those nodes are already navigable and linkable to Italian statutes. The next step is the full text via a dedicated EUR-Lex/CELLAR connector (SPARQL + content negotiation on CELLAR), which will bring the body of the acts into the semantic engine.
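Since the connector is still in progress, here is only a rough sketch of what "SPARQL + content negotiation" could look like. It builds the two HTTP requests without sending them. The endpoint URL, the resource URI pattern, the `cdm:resource_legal_id_celex` property and the header values are assumptions to check against the Publications Office documentation; the function names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoints; verify against the Publications Office documentation.
SPARQL_ENDPOINT = "https://publications.europa.eu/webapi/rdf/sparql"
CELEX_RESOURCE = "http://publications.europa.eu/resource/celex/"


def _check_celex(celex: str) -> str:
    # Simplification: accept plain alphanumeric CELEX ids only, which also
    # keeps the value safe to embed in a query string or URL.
    if not celex.isalnum():
        raise ValueError(f"unsupported CELEX id: {celex!r}")
    return celex


def sparql_request(celex: str) -> Request:
    """Build (not send) a SPARQL lookup of the work behind a CELEX id."""
    celex = _check_celex(celex)
    query = (
        "PREFIX cdm: <http://publications.europa.eu/ontology/cdm#>\n"
        "SELECT ?work WHERE {\n"
        "  ?work cdm:resource_legal_id_celex ?id .\n"
        f'  FILTER(STR(?id) = "{celex}")\n'
        "}"
    )
    url = SPARQL_ENDPOINT + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def fulltext_request(celex: str, lang: str = "ita") -> Request:
    """Build (not send) a content-negotiated request for the act's text."""
    celex = _check_celex(celex)
    return Request(
        CELEX_RESOURCE + celex,
        headers={"Accept": "application/xhtml+xml", "Accept-Language": lang},
    )
```

Passing either request to `urllib.request.urlopen` would perform the call; the sketch stops short of that so it stays runnable offline.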

The method: RDF + CDM, not scraping

Parsing is done on RDF files with rdflib, querying the EU’s CDM ontology. The real (and declared) difficulties: a share of acts had malformed or incomplete RDF, some lacked an Italian title, others had partial metadata. That’s why the clean-extraction rate was around 46% — a number that tells the reality of institutional dumps, not the brochure.
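To make "clean-extraction rate" concrete, here is one way such a rate could be computed once the RDF has been parsed into plain records. It is a sketch under assumptions: the definition of "clean" (all four required fields present), the field names, and the convention that `None` stands for a file whose RDF failed to parse are mine, not the project's.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple

# Assumed definition of a "clean" record: all of these fields are non-empty.
REQUIRED_FIELDS = ("celex", "title", "date", "act_type")


def missing_fields(record: dict) -> List[str]:
    """List the required fields that are absent or blank in one record."""
    return [f for f in REQUIRED_FIELDS if not (record.get(f) or "").strip()]


def extraction_report(
    records: Iterable[Optional[dict]],
) -> Tuple[List[dict], float, Counter]:
    """Split records into clean ones and counted failure reasons.

    A None entry represents a file whose RDF could not be parsed at all.
    """
    reasons: Counter = Counter()
    clean: List[dict] = []
    total = 0
    for rec in records:
        total += 1
        if rec is None:
            reasons["unparseable_rdf"] += 1
            continue
        missing = missing_fields(rec)
        if missing:
            for field in missing:
                reasons["missing_" + field] += 1
        else:
            clean.append(rec)
    rate = len(clean) / total if total else 0.0
    return clean, rate, reasons
```

Counting the reasons separately from the rate shows whether the losses come from malformed files or from partial metadata.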

The shared identity: CELEX and ELI

The real value isn’t «having» the EU acts, but linking them. Each European act has a CELEX (EUR-Lex’s classic ID) and an ELI: the same stable identifiers we use for Italian statutes. That’s what lets you trace an edge between an Italian legislative decree and the directive it transposes, or between a law and the EU regulation it implements — the heart of the «law ↔ law, no ambiguity» promise.
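As an illustration of how the two identifiers relate, the sketch below derives an ELI from a CELEX number and uses it as the target of a graph edge. It is a simplified heuristic, not the project's linker: it covers only sector 3 regulations, directives and decisions in the common `year + letter + four digits` form, ignores corrigenda and other variants, and the year/number mapping should be checked against EUR-Lex for older acts. The `Edge` type, the `"transposes"` relation name and the Italian identifier in the usage note are hypothetical placeholders.

```python
import re
from typing import NamedTuple, Optional

# Simplified CELEX shape: sector digit, 4-digit year, type letter, 4-digit number.
CELEX_RE = re.compile(r"^(?P<sector>\d)(?P<year>\d{4})(?P<kind>[A-Z])(?P<number>\d{4})$")

# CELEX type letter -> ELI type segment (regulation, directive, decision).
ELI_TYPES = {"R": "reg", "L": "dir", "D": "dec"}


class Edge(NamedTuple):
    source: str    # identifier of the Italian statute
    relation: str  # e.g. "transposes"
    target: str    # ELI of the EU act


def celex_to_eli(celex: str) -> Optional[str]:
    """Derive an ELI URI from a CELEX id, or None if the id is out of scope."""
    m = CELEX_RE.match(celex)
    if not m or m["sector"] != "3" or m["kind"] not in ELI_TYPES:
        return None
    number = int(m["number"])  # drop zero padding: "0679" -> 679
    return f"http://data.europa.eu/eli/{ELI_TYPES[m['kind']]}/{m['year']}/{number}/oj"


def transposition_edge(italian_id: str, celex: str) -> Edge:
    """Build an edge from an Italian statute to the EU act it transposes."""
    eli = celex_to_eli(celex)
    if eli is None:
        raise ValueError(f"cannot derive an ELI from CELEX {celex!r}")
    return Edge(italian_id, "transposes", eli)
```

For example, `transposition_edge("example:it/decree-placeholder", "32019L0790")` yields an edge whose target is `http://data.europa.eu/eli/dir/2019/790/oj`.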

FAQ

Did you download all of EU law?

I downloaded the full EU Publications Office bulk dump (56,292 acts) and integrated ~2,312 as nodes in the graph. The full text of all acts requires the EUR-Lex/CELLAR connector, which is the next phase. I prefer to state the real numbers rather than say «all».

What is CELEX?

CELEX is the unique identifier EUR-Lex uses for every act of Union law. Together with ELI, it’s the stable key that lets you cite and link EU acts unambiguously, including to the national rules that implement them.

Why integrate the EU into the Italian-law graph?

Because much of Italian legislation transposes or implements European law. Without EU acts in the graph, the most important links are missing: which directive sits behind a decree, what the European legal basis of a rule is.


We link Italian and European statutes into a single citable graph — with stable identifiers, not paraphrases. See how Open·Parlamento works, or let’s talk.

Knowledge Graph · EU Law · Open Data · AI

Written by Giulio Garofalo