Open Knowledge Platform
This page is the contributor and API-user reference. For the narrative case-study with Middle-earth numbers, see the blog post. For the full developer design doc, see docs/external_metadata_sources.md in the repo.
What it is
GameEdu's catalogue — locations, characters, creatures, items, organisations, events, books, maps — is pulled from canonical fan-curated sources (Tolkien Gateway, Fandom wikis, Comic Vine, 5e.tools, L-Space, RPG Geek, BoardGameGeek, World Anvil, Sphero Edu API, OpenTTD TrueWiki, …) into Wagtail snippets. Every imported field is recorded with provenance — which source, when, by which handler version — and the raw scrape payload is preserved so editors can diff over time, replay a historical scrape, or roll back.
The three-table linkage layer
- MetadataCatalog: one row per external source. Slug, base URL, handler name. Seeded by `seed_metadata_catalogs`.
- MetadataEntry: one row per (target snippet, source). Generic FK to any catalogue model, plus `external_id`, `last_scraped`, `last_status`.
- MetadataScrape: inline child of Entry. One row per fetch: status, raw payload JSON, handler version. Full audit history.
Entry is the join, Scrape is the history. The same character can have entries against Tolkien Gateway and against LotR Fandom, each with its own scrape series. Handlers decide which snippet fields to write back on apply(); the scrape payload always survives verbatim.
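As a rough illustration of how the three tables relate, the sketch below models them as plain Python dataclasses rather than the actual Django models. Only `external_id`, `last_scraped` and `last_status` are field names taken from this page; everything else (`target` as a string stand-in for the generic FK, `fetched_at`, `record_scrape`, the `tolkien-gateway` slug and handler name) is an assumption made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass
class MetadataCatalog:
    """One row per external source."""
    slug: str
    base_url: str
    handler_name: str


@dataclass
class MetadataScrape:
    """One row per fetch; the raw payload is kept verbatim."""
    status: str
    payload: dict[str, Any]
    handler_version: str
    fetched_at: datetime  # assumed field name


@dataclass
class MetadataEntry:
    """One row per (target snippet, source)."""
    catalog: MetadataCatalog
    target: str  # stand-in for the generic FK, e.g. "character:gandalf"
    external_id: str
    last_scraped: datetime | None = None
    last_status: str = ""
    scrapes: list[MetadataScrape] = field(default_factory=list)

    def record_scrape(self, status: str, payload: dict[str, Any],
                      handler_version: str) -> MetadataScrape:
        """Append to the history and refresh the summary fields (hypothetical helper)."""
        scrape = MetadataScrape(status, payload, handler_version,
                                datetime.now(timezone.utc))
        self.scrapes.append(scrape)
        self.last_scraped = scrape.fetched_at
        self.last_status = status
        return scrape


# The same character linked to two sources, each with its own scrape series.
tg = MetadataCatalog("tolkien-gateway", "https://tolkiengateway.net", "tolkien_gateway")
fandom = MetadataCatalog("fandom-lotr", "https://lotr.fandom.com", "fandom_full")
entries = [
    MetadataEntry(tg, "character:gandalf", "Gandalf"),
    MetadataEntry(fandom, "character:gandalf", "Gandalf"),
]
entries[0].record_scrape("ok", {"title": "Gandalf"}, "1")
```

The point of the split is that `MetadataEntry` carries only the latest state, while every `MetadataScrape` row stays untouched for diffing or replay.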
The metagraph
The setting page anchors everything: dedup is scoped to a setting, every catalogue snippet pivots through one via inline *Settings orderables. Cross-links between snippets (character → location, event → location, event → organisation, character → creature) are real FKs or inline orderables — queryable in one hop.
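To make "dedup is scoped to a setting" concrete, here is a minimal sketch in which the dedup key is the pair (setting, normalised name). The in-memory `catalogue` dict, the `get_or_create` helper and the lowercase/strip normalisation are all assumptions for illustration; the real engine works against Wagtail snippets and may normalise differently.

```python
from __future__ import annotations

# Hypothetical in-memory stand-in for the snippet tables.
catalogue: dict[tuple[str, str], dict[str, str]] = {}


def get_or_create(setting: str, name: str) -> tuple[dict[str, str], bool]:
    """Return (snippet, created), deduplicating only within one setting."""
    key = (setting, name.strip().lower())  # assumed normalisation rule
    created = key not in catalogue
    snippet = catalogue.setdefault(key, {"setting": setting, "name": name.strip()})
    return snippet, created


first, created_first = get_or_create("middle-earth", "Rivendell")
again, created_again = get_or_create("middle-earth", " rivendell ")
other, created_other = get_or_create("discworld", "Rivendell")
```

Under these assumptions the second call reuses the first snippet, while the same name in a different setting produces a separate row.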
Supported sources
| Source | Domain | Auth | Status |
|---|---|---|---|
| Tolkien Gateway | tolkiengateway.net | none (Cloudflare via Flaresolverr) | handler + bulk sync |
| LotR Fandom | lotr.fandom.com | none (MediaWiki API) | handler + bulk sync |
| L-Space (Discworld) | wiki.lspace.org | none | handler + bulk sync (chars / locs / creatures / events) |
| Fandom — Discworld | discworld.fandom.com | none | fandom_full handler + fallback |
| Comic Vine | comicvine.gamespot.com | API key (free) | handler + sync_external |
| World Anvil | worldanvil.com | API token | handler + bulk seed |
| 5e.tools | 5e.tools | none (offline JSON mirror) | seed commands (books, creatures) |
| RPG Geek | rpggeek.com | scraping | sync_setting_rpggeek_metadata |
| BoardGameGeek | boardgamegeek.com | XMLAPI v2 | game_board |
| Scryfall (MTG) | scryfall.com | none | game_magic |
| Pokemon TCG | pokemontcg.io | API key | game_pokemon |
| TrueWiki (OpenTTD) | wiki.openttd.org | none (scrape) | truewiki_tools |
| Sphero Edu API | api.littlebits.com | API key | sphero_tools |
Releases
- r390–r396: initial three-table design + Comic Vine handler + `sync_external` command + `--fill-missing` replay mode + universal MetadataCatalog/Entry/Scrape replaces `GameExternalReference`.
- r820: full-body Fandom handler (`fandom_full`) with sectioned `html_intro`, used by all current bulk-sync runs.
- r849: Tolkien Gateway handler (Cloudflare bypass via Flaresolverr) + `fandom-lotr` catalog seeded.
- r851: `sync_tolkien_characters` + generalised `_wiki_category_sync` multi-wiki runner.
- r852: `sync_tolkien_locations` + `sync_tolkien_events` + `extra_apply` hook for date-parse and location-FK attachment.
- r853: `GameCharacterLocations` inline orderable + `backfill_character_locations` command.
First end-to-end pipeline case study: Middle-earth in a weekend, with 875 characters, 700 locations, 720 character↔location edges, and 1,544 OK scrapes across two sources.
Adding a new source
- Catalog seed: append the source to `seed_metadata_catalogs.DEFAULTS` with its slug, handler name, and base URL.
- Handler: drop `game_core/sync/<source>.py` implementing `fetch(entry) → payload` and `apply(entry, payload)`. Decorate with `@register`. Import from `game_core/sync/__init__.py` so registration runs at boot.
- Importer command: for bulk ingest, add a `sync_<source>_<entity>` management command that drives `_wiki_category_sync.run_multi_wiki_sync` (MediaWiki) or a custom walker. The engine handles setting-scoped dedup, collision suffixing, freshness skip, MetadataEntry / Scrape stamping, and the cross-link pass.
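The handler step can be pictured with a self-contained sketch of the fetch/apply/register pattern. This is not the project's actual `@register` implementation, whose signature is not shown on this page: the `HANDLERS` dict, the `name` and `version` attributes, the `example_wiki` handler and the dict-shaped `entry` are all hypothetical stand-ins.

```python
from __future__ import annotations

from typing import Any

# Hypothetical registry: handler name -> handler class.
HANDLERS: dict[str, type] = {}


def register(cls: type) -> type:
    """Class decorator that indexes a handler under its `name` attribute."""
    HANDLERS[cls.name] = cls
    return cls


@register
class ExampleWikiHandler:
    name = "example_wiki"  # would match the handler name in the catalog seed
    version = "1"

    def fetch(self, entry: dict[str, Any]) -> dict[str, Any]:
        # A real handler would call the external source here; this one is stubbed.
        return {"title": entry["external_id"], "summary": "stub summary"}

    def apply(self, entry: dict[str, Any], payload: dict[str, Any]) -> None:
        # The handler decides which snippet fields to write back.
        entry["snippet"]["description"] = payload["summary"]


# Look the handler up by name, then run fetch followed by apply.
entry = {"external_id": "Gandalf", "snippet": {}}
handler = HANDLERS["example_wiki"]()
payload = handler.fetch(entry)
handler.apply(entry, payload)
```

Registration happening as a side effect of the class definition is why the module has to be imported at boot: a handler file that is never imported never reaches the registry.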
API
Catalogue snippets are exposed via Wagtail's GraphQL endpoint at /graphql/. Each MetadataScrape.payload is publicly readable JSON — use it to build third-party tools (campaign planners, fan-wiki cross-references, AI assistants grounded in canon) without re-scraping the original sources.
Import API for editors / external pipelines:
curl -X POST https://www.gameedu.eu/api/wagtail/import/ \
-H "X-API-Key: $API_KEY" -H "Content-Type: text/yaml" \
--data-binary @page.yaml
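For pipelines written in Python, the same request can be assembled with the standard library. The sketch below mirrors the curl call above (URL, `X-API-Key` and `Content-Type` headers, raw YAML body); the function name `build_import_request` is an assumption, and the shape of the server's response is not documented here, so the example stops at building the request.

```python
import urllib.request

IMPORT_URL = "https://www.gameedu.eu/api/wagtail/import/"


def build_import_request(yaml_path: str, api_key: str) -> urllib.request.Request:
    """Build the POST request that the curl example sends (hypothetical helper)."""
    with open(yaml_path, "rb") as fh:
        body = fh.read()  # sent as-is, like curl's --data-binary
    return urllib.request.Request(
        IMPORT_URL,
        data=body,
        method="POST",
        headers={"X-API-Key": api_key, "Content-Type": "text/yaml"},
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; reading the key from an environment variable, as the curl example does with `$API_KEY`, keeps it out of source control.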