Milestone: thank you — Twinkle Hub reached 3,000 registered users in its first month.
To integrate with Anthropic Official Connectors and align our auth flow with the directory's requirements, new sign-ups are paused as of today. Existing users may also see brief API key disruptions during the cutover.
MCP is the open protocol for AI agents: Claude, OpenClaw, Hermes, or your own agent loop. Twinkle Hub bundles Taiwan data sources behind a single MCP endpoint. Set up your client once, and your AI can query datasets, run SQL, and call tools.
✓ We rebuilt the drug-leaflet search back end on top of a pre-built index. Three tools now answer in single-digit milliseconds: `opendata-search_drug_label` (precise leaflet-field search), `opendata-check_drug_interaction` (scans 2–10 drug leaflets for interactions), and the leaflet-merge stage inside `opendata-get_drug_details`
✓ Warm-cache latency measured live: leaflet-field search 1.78 ms (down from 50–200 ms); two-drug interaction scan 0.27 ms
✓ The drug-licence registry search (`opendata-search_drug` and friends) still uses the previous path and will move in the next release; this release focuses on the highest-traffic surface first
Try asking:
Drugs whose leaflet mentions both warfarin and NSAIDs (noticeably faster)
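To picture why a pre-built index helps here: the expensive text scan happens once at build time, and each query becomes a set intersection. The sketch below is an illustration only, not Twinkle Hub's actual implementation. The sample leaflets and the `build_index` and `leaflets_mentioning_all` helpers are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical sample data: drug name -> leaflet "drug interactions" text.
LEAFLETS = {
    "drug_a": "Concurrent use with warfarin or NSAIDs may increase bleeding risk.",
    "drug_b": "Avoid combining with warfarin.",
    "drug_c": "NSAIDs may reduce the antihypertensive effect.",
}


def build_index(leaflets):
    """Build a term -> set-of-drugs map once, ahead of query time."""
    index = defaultdict(set)
    for drug, text in leaflets.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(drug)
    return index


def leaflets_mentioning_all(index, terms):
    """Return drugs whose leaflet contains every term, via set intersection."""
    postings = [index.get(term.lower(), set()) for term in terms]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


INDEX = build_index(LEAFLETS)
print(leaflets_mentioning_all(INDEX, ["warfarin", "NSAIDs"]))
```

A real index would also need tokenization for Chinese text and per-field postings, both of which this sketch leaves out.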
7 new health-domain MCP tools — ICD-10 / TFDA drug licences / drug leaflets / food nutrition
Datasets: 4
✓ ICD-10-CM zh-TW, 96,803 codes: `opendata-lookup_icd10` supports code-prefix matching and fuzzy keyword search in Chinese or English (source: MOHW data.gov.tw 177507, OGDL v1)
✓ TFDA full drug licence registry, 71,836 items: `opendata-search_drug` searches by name / indication / licence number / active-only; `opendata-get_drug_details` returns both administrative fields and structured leaflet fields in one call (source: data.gov.tw 9122, OGDL v1)
✓ Drug leaflets, 44,663 items, structured into 6 fields (indications, contraindications, warnings, drug interactions, adverse reactions, dosage): `opendata-search_drug_label` searches inside leaflet fields directly; `opendata-check_drug_interaction` scans 2–10 drugs at once for interaction text (source: our own HF dataset `twinkle-ai/tw-drug-labels-vision`, CC-BY-4.0)
Try asking:
List every ICD-10 code for type-2 diabetes mellitus
TFDA licences containing Metformin — what are the licensed indications
2026-06-02 · v1.22
Faster semantic search for judicial, exam, and patent corpora
Datasets: 3
✓ Outbound query embeddings now go through a dedicated managed endpoint, with near-zero timeouts at p99
✓ Bulk preprocessing (chunk embedding, backfill) is now split from outbound query traffic, so outbound queries no longer degrade during corpus backfill
✓ Zero user-facing changes: tool signatures, response schema, and env vars are all unchanged
Try asking:
Find 2024 Kaohsiung District Court trademark-infringement rulings (noticeably faster)
Bar exam questions on "director liability"
Why this exists
Taiwan data is scattered across 100+ portals. AI app builders don't have time to wire up each one.
Twinkle Hub bundles them into a single MCP service: discover datasets, fetch rows, join across tables, pay — all on one endpoint. Phase one is making data.gov.tw excellent; more sources (industry associations, local governments, commercial data) come next.
01
One MCP endpoint
Every source, every domain, every tool under the same URL. Configure your client once.
02
Deterministic billing
Fixed price per tool — no token-based gambling. Prepaid wallet, hard cutoff, no surprise bills.
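As a rough sketch of what fixed per-tool pricing with a hard cutoff means in practice, assuming made-up prices and a hypothetical `Wallet` class (neither is Twinkle Hub's real billing code or price list):

```python
from dataclasses import dataclass, field

# Hypothetical per-call prices; the real price list is not stated here.
PRICES = {"opendata-lookup_icd10": 1, "opendata-search_drug": 2}


class InsufficientBalance(Exception):
    """Raised when the prepaid balance cannot cover a call (the hard cutoff)."""


@dataclass
class Wallet:
    balance: int
    ledger: list = field(default_factory=list)

    def charge(self, tool: str) -> int:
        """Deduct the tool's fixed price and return the remaining balance."""
        price = PRICES[tool]  # same price on every call, independent of token count
        if price > self.balance:
            # Refuse the call instead of letting the balance go negative.
            raise InsufficientBalance(
                f"{tool} costs {price}, but the balance is {self.balance}"
            )
        self.balance -= price
        self.ledger.append((tool, price))
        return self.balance
```

Because the price depends only on the tool name, the cost of a planned sequence of calls can be summed before any of them run.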
03
License compliance
Original license metadata is passed through end-to-end so downstream apps don't violate license terms.
04
MCP-native protocol
`tools/list` and `tools/call` are native endpoints, not a translation layer over OpenAI-style tool calling. Claude / OpenClaw / Hermes / Continue / Cline / your own agent loop: same wire format, no SDK, no adapter.
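For readers writing their own agent loop, the wire format amounts to JSON-RPC 2.0 messages with the method names `tools/list` and `tools/call`. The sketch below only builds the request bodies. The transport, the initialization handshake, and authentication are omitted, and the `query` argument name is a guess; the real argument schema comes from each tool's entry in the `tools/list` response.

```python
import itertools
import json

_request_ids = itertools.count(1)


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request object with a fresh numeric id."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


def list_tools():
    """Request asking the server which tools it exposes."""
    return jsonrpc_request("tools/list")


def call_tool(name, arguments):
    """Request invoking one tool by name with a dict of arguments."""
    return jsonrpc_request("tools/call", {"name": name, "arguments": arguments})


# "query" is an assumed argument name, used for illustration only.
request = call_tool("opendata-lookup_icd10", {"query": "E11"})
print(json.dumps(request, indent=2))
```

Any MCP client library does the same thing under the hood, so this is mainly useful for debugging or for very small agents.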
Why Twinkle Hub
Downloading 53,000 datasets is easy. Making them usable to AI is the hard part.
Making Taiwan open data usable to AI isn't a one-off catalog download. We handle the cleanup, conversion, classification, and querying — your AI just calls the API instead of crawling a hundred different portals.
52,960 / 19
53,000 datasets, sorted automatically
All 52,960 datasets on data.gov.tw, sorted into 19 categories. New ones land in the right place the next day — the index never ages out.
10+ formats
All the file formats, one shape
CSV, JSON, Excel, PDF, geo files — whatever the format, we read it and clean up the column names. Your AI doesn't choke just because one source is PDF and another is XLSX.
~50 ms
Ask a question, answer in 50 ms
Want your AI to query data on specific conditions? We have a query layer — about 50 ms across all 53,000 datasets. We don't dump a zip and tell you to process it yourself.
Daily refresh
Synced with the government every day
A daily cron pulls the latest catalog, picks up what's new or retired, and runs classification. The data stays fresh; it doesn't go stale because we forgot to update.
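The "new or retired" step boils down to a set difference between yesterday's catalog and today's. A minimal sketch, assuming datasets are identified by string ids; `diff_catalog` and the sample ids are hypothetical, not the production job:

```python
def diff_catalog(previous_ids, latest_ids):
    """Compare two catalog snapshots and report added and removed dataset ids."""
    previous, latest = set(previous_ids), set(latest_ids)
    return {
        "new": sorted(latest - previous),      # needs fetching and classification
        "retired": sorted(previous - latest),  # should be dropped from the index
    }


# Hypothetical snapshots from two consecutive days.
yesterday = ["1001", "1002", "1003"]
today = ["1002", "1003", "1004"]
changes = diff_catalog(yesterday, today)
print(changes)
```

Only the ids under `new` would need to go through classification, which keeps the daily run proportional to the change rather than to the full catalog.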
Beyond the data
One thing we wired up alongside
OpenData is the main course. This one also matters for agents — we did it too.
20 official Anthropic Agent Skills. Load one into Claude Desktop / Claude Code / GitHub Copilot CLI / OpenAI Codex CLI — or any MCP-compatible agent — and it already knows which tool to call, which filter to use, and what a typical query looks like.
2026-05-31 internal benchmark — Phase 1 plan-only across 11 skill domains, Claude Opus 4.7 subagent A/B (incl. Round 2 narrow-domain: health/finance/environment/agriculture); Phase 2 real MCP (LVR query: "Taipei Da'an 2024 transactions over NT$50M"). Single-query sample; cold-start and retry costs not included. See full report for caveats.