Schema documentation
Two tables make up the data product: insidereu.filings (MAR Article 19 PDMR) and insidereu.substantial_holdings (Transparency Directive / DTR-5 / national major-holding notifications).
Schema stable since 2025-Q4; ingested_at added in 2026-Q1. No field has been renamed or dropped since. Other tables in the insidereu schema are application-internal and not part of the data product.
Table: insidereu.filings
PDMR (Person Discharging Managerial Responsibilities) and closely-related-person insider transactions filed under Article 19 of EU MAR (Regulation 596/2014) and UK/EEA equivalents.
| Column | Type | Null | Meaning |
|---|---|---|---|
| id | text PK | no | Stable cross-scrape dedup key, format {country}-{source-system-ref}. Re-ingesting the same record is idempotent. |
| country | text | no | ISO 3166-1 alpha-2 of the regulator's jurisdiction. Not the issuer's domicile. |
| insider_id | text | no | Stable insider key within a country. Cross-country join not guaranteed. |
| insider_name | text | no | As reported by regulator. Nordic and other transliteration preserved. |
| insider_role | text | yes | E.g. Chief Executive Officer, Member of the Board, Closely associated person, or the literal PDMR when no sub-role is provided. |
| company_name | text | no | Issuer name. |
| isin | text | yes | Coverage by country varies — see the coverage matrix. When null, join on (country, company_name). |
| ticker | text | yes | Yahoo Finance ticker. Mainly populated for Tier 1-3 countries. |
| exchange | text | yes | Reporting venue. |
| transaction_date | date | no | Date the trade was executed. |
| filing_date | date | no | Regulator publication date. Currently unreliable for several countries (audit § 4d) where it's copied from transaction_date. |
| transaction_type | text | no | CHECK IN ('buy','sell','unknown'). 'unknown' = source feed had no parseable direction. Filter transaction_type IN ('buy','sell') for direction-confirmed rows only. |
| shares | numeric | yes | Volume of the trade. |
| price | numeric | yes | Price per share in the original currency. |
| value_eur | numeric | yes | Trade size in EUR. Derived: shares * price * fx_rate_on_transaction_date using ECB reference rates. |
| currency | text | yes | ISO 4217. EUR (default), CHF, SEK, GBP, NOK, DKK. |
| source_url | text | yes | Deep link to the regulator's original filing. Always populated for live-scraped rows. |
| raw_data | jsonb | yes | Source-system payload preserved verbatim. Used to re-derive normalized fields when needed. |
| scraped_at | timestamptz | yes | When the scraper first hit the source page. |
| current_price_eur | numeric | yes | Most recent close, refreshed nightly for active tickers. |
| price_refreshed_at | timestamptz | yes | When current_price_eur was last refreshed. |
| ingested_at | timestamptz | no | When the row landed in our DB. Use this for ingestion-velocity / latency analysis. |
Dedup recommendation
A server-side enrichment-aware view is shipped: read from insidereu.filings_dedup rather than the base table. It collapses two enriched rows from different sources with matching (country, issuer-key, txn_date, insider, direction, shares, price) and signal-only rows whose enriched sibling exists in another source. It does not collapse two signal-only rows with no enriched sibling. Same schema, no extra columns. The public REST API and sample CSV exports already read from the view.
Non-cash events
Option grants, RSU vests, and gifts are normalized: a grant or vest is recorded as buy with price = 0 and value_eur = 0; a gift-out is sell at price = 0. To exclude non-cash events, filter price > 0 AND value_eur > 0.
NULL ISIN handling
22 of 28 countries have <50% ISIN coverage. When isin IS NULL, join on (country, company_name) against a reference universe. Sample CSVs include an enriched_isin column derived by fuzzy-matching company name against the Yahoo Finance ticker universe at export time.
Sample row
{
"id": "SE-FI-2026-05-19-A4B92",
"country": "SE",
"insider_id": "SE-FI-INSIDER-7741",
"insider_name": "Erik Hägglund",
"insider_role": "Chief Executive Officer",
"company_name": "Saab AB",
"isin": "SE0000112385",
"ticker": "SAAB-B.ST",
"exchange": "Nasdaq Stockholm",
"transaction_date": "2026-05-16",
"filing_date": "2026-05-19",
"transaction_type": "buy",
"shares": 2000,
"price": 524.40,
"value_eur": 90876.12,
"currency": "SEK",
"source_url": "https://marknadssok.fi.se/.../A4B92.html",
"scraped_at": "2026-05-20T02:30:35.068+00:00",
"ingested_at": "2026-05-20T02:30:35.068+00:00"
}Table: insidereu.substantial_holdings
Transparency Directive (Directive 2004/109/EC) major-holding notifications and national equivalents — DTR 5 in the UK, BaFin §33-§38a WpHG in Germany, AFM in NL, CNMV in ES. Triggered when any holder crosses 3% / 5% / 10% / 15% / 20% / 25% / 30% / 50% / 75% of voting rights in a listed issuer.
| Column | Type | Null | Meaning |
|---|---|---|---|
| id | text PK | no | Stable per-source key. |
| country | text | no | ISO 3166-1 alpha-2 of the regulator that received the notification. |
| filer_name | text | no | Notifying entity (institutional holder, individual, syndicate). |
| filer_id | text | no | Stable filer key within a country. |
| issuer_name | text | no | The company whose voting rights crossed a threshold. |
| issuer_isin | text | yes | ISIN. Coverage by country varies — see audit. |
| issuer_ticker | text | yes | Yahoo Finance ticker (derived). |
| filing_date | date | no | Date the regulator published the notification. |
| event_date | date | yes | Date the threshold was crossed. Often 1-3 business days before filing_date. ES currently inverted (audit § 4a). |
| threshold_pct | numeric | yes | Threshold crossed, in percent. Currently 0% populated except NO (17%). Full population on the roadmap. |
| direction | text | no | CHECK IN ('crossed_above','crossed_below','disclosed'). ~99% of rows are disclosed today. Reliable directional values only for CH (98%) and NO (1.9%). |
| new_pct | numeric | yes | Filer's voting rights percentage after the event. |
| previous_pct | numeric | yes | Filer's voting rights percentage before the event. |
| voting_rights_count | bigint | yes | Absolute count of voting rights held after the event. |
| regulation | text | yes | The exact regulatory citation — e.g. Section 33 WpHG, DTR 5.1.2 R, AFM 5:38 Wft. |
| source_url | text | yes | Deep link to the original notification. |
| raw_data | jsonb | yes | Verbatim source payload. |
| created_at | timestamptz | no | Equivalent to filings.scraped_at. |
| ingested_at | timestamptz | no | When the row landed in our DB. Use for ingestion-velocity analysis. |
Dedup
Natural key: (country, filer_id, issuer_isin OR issuer_name, event_date, threshold_pct, direction). The id PK dedupes within a single source. No multi-source overlap observed today (each regulator publishes exactly once).
direction = 'disclosed'
Three regulators (FSMA-BE, AMF-FR, AFM-NL) publish “current holdings disclosure” notices that don't state whether the filer crossed up or down — they only restate the new position. Those rows carry direction = 'disclosed'. For strict directional whale events, filter direction <> 'disclosed'.
Sample row
{
"id": "DE-BAFIN-2026-05-19-77881",
"country": "DE",
"filer_name": "BlackRock, Inc.",
"filer_id": "DE-BAFIN-FILER-BLACKROCK",
"issuer_name": "Volkswagen AG",
"issuer_isin": "DE0007664039",
"issuer_ticker": "VOW3.DE",
"filing_date": "2026-05-19",
"event_date": "2026-05-15",
"threshold_pct": null,
"direction": "disclosed",
"new_pct": 5.12,
"previous_pct": 4.98,
"voting_rights_count": 15125400,
"regulation": "Section 33 WpHG",
"source_url": "https://www.bafin.de/.../77881.pdf",
"ingested_at": "2026-05-20T02:31:13.314+00:00"
}Known caveats — read before integrating
transaction_typeis ternary.'unknown'flags rows where the source did not carry parseable direction (signal-only). Filtertransaction_type IN ('buy','sell')for direction-confirmed rows only. Fully reliable buy/sell mix today: CH, DE, SE; smaller reliable tails: DK, FI, GB, IT, NO, ES.filing_datesemantics vary. For 18 countries it equalstransaction_date. For GB it equalsingested_at::date. Useingested_atfor “when we knew about it” andtransaction_datefor “when it happened.” Both are reliable.threshold_pctis sparsely populated in holdings. Until full backfill, the most reliable holdings signal is the changenew_pct - previous_pct.- ISIN coverage is non-uniform. Tier 1-2 countries are >99%; Tier 4 countries can be 0%. See per-country detail in the coverage matrix →
- Dedup is applied server-side via
insidereu.filings_dedup. Read from the view, not the base table. current_price_euris refreshed nightly for tickers with confirmed Yahoo Finance mapping. Coverage trackstickercoverage.- Non-cash events (grants, vests, gifts) are mapped to buy/sell with
price = 0. Filterprice > 0 AND value_eur > 0to exclude.
API access
Filings and holdings are available via three delivery surfaces:
- REST:
https://insidereu.com/api/v1/filings?country=DE&since=2026-05-01 - CSV / Parquet drop: nightly to S3-compatible bucket (pilot tier and above).
- Postgres read replica: enterprise tier.
API keys are provisioned per-seat via insidereu.api_keys with hashed key storage.