Live · Updated 2026-04

Mystia Assistant

A self-contained dashboard that turns scattered reference data for Touhou Mystia's Izakaya into searchable bilingual tables.

Astro · React · TypeScript · Tailwind

Problem

The game data needed for decision-making was spread across multiple sources — wiki pages, forum posts, Reddit discussions, and other community notes. Those sources were often incomplete, inconsistent, hard to search quickly, and not structured for fast in-game reference.

On top of that, some of the useful information was split between English and Japanese sources, which made quick lookup even slower.

Goal

Build a structured knowledge base that could:

  • Collect data from multiple sources
  • Normalize it into a consistent shape
  • Preserve bilingual usefulness
  • Avoid destructive updates when enrichment happens later
  • Expose everything through a fast, searchable UI
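A minimal sketch of what such a normalized record might look like. All field and type names here are illustrative assumptions, not the project's actual schema; the point is that each record carries bilingual text and per-field provenance from the start:

```typescript
// Illustrative normalized record shape (names are hypothetical).
interface BilingualText {
  en: string;
  ja: string;
}

interface SourceInfo {
  source: string;    // e.g. "wiki", "forum"
  trust: number;     // higher = more trusted
  fetchedAt: string; // ISO date of the crawl
}

interface ItemRecord {
  id: string;
  name: BilingualText;
  tags: string[];
  price?: number; // optional until some source supplies it
  provenance: Record<string, SourceInfo>; // which source set which field
}

const sake: ItemRecord = {
  id: "drink-001",
  name: { en: "Sake", ja: "酒" },
  tags: ["drink"],
  provenance: { name: { source: "wiki", trust: 3, fetchedAt: "2026-04-01" } },
};
```

Keeping provenance alongside each field is what makes the later merge rules possible: without it, there is no way to decide whether an incoming value should win.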

Merge and Enrichment Strategy

The project was treated as a data reconciliation problem, not just a UI problem. Staged enrichment was used so later sources augment records instead of blindly replacing them. Previously known-good information is preserved instead of trusting the newest crawl by default. The final output is structured around rapid lookup, not around mirroring source structure.
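The staged-enrichment rule above can be sketched as a fill-only merge: a later source may add fields the record is missing, but never replaces a value that already exists. This is a simplified sketch, not the project's actual merge code:

```typescript
// Fill-only merge: incoming data may only fill gaps in the
// existing record; present values are never overwritten.
type Rec = Record<string, unknown>;

function enrich(existing: Rec, incoming: Rec): Rec {
  const out: Rec = { ...existing };
  for (const [key, value] of Object.entries(incoming)) {
    if (out[key] === undefined || out[key] === null) {
      out[key] = value; // fill a gap
    }
    // otherwise: keep the previously known-good value
  }
  return out;
}

const merged = enrich(
  { name: "Sake", price: 120 },          // earlier, trusted crawl
  { price: 999, tags: ["drink"] },       // later, noisier crawl
);
// merged keeps price 120 and gains tags ["drink"]
```

The asymmetry is deliberate: a noisy later crawl can make a record more complete, but it cannot make it less correct.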

Key Engineering Challenges

  • Crawling multiple external sources with different formats and quality levels
  • Extracting semi-structured data into a clean internal model
  • Preserving trust hierarchy between sources
  • Preventing later pipeline steps from overwriting higher-quality data gathered earlier
  • Keeping the transformation pipeline incremental instead of destructive
  • Normalizing bilingual English/Japanese values for display and search

Failure Modes

The risky part of this type of pipeline is silent corruption. If a later source is lower quality but still writes over earlier fields, the system gets “updated” while actually becoming less correct.

That means pipeline order and merge rules matter as much as extraction itself.
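One way to encode that merge rule is to gate every field write on source trust, so a lower-trust crawl can never displace a higher-trust value regardless of pipeline order. The trust levels and helper names here are hypothetical:

```typescript
// A field value tagged with the trust level of its source.
interface FieldValue<T> {
  value: T;
  trust: number; // higher = more trusted source
}

// A write succeeds only if the field is empty or the new
// source is at least as trusted as the one that set it.
function writeField<T>(
  current: FieldValue<T> | undefined,
  value: T,
  trust: number,
): FieldValue<T> {
  if (current !== undefined && current.trust > trust) {
    return current; // reject the lower-trust overwrite
  }
  return { value, trust };
}

const a = writeField(undefined, "wiki name", 3);  // first write lands
const b = writeField(a, "forum name", 1);         // rejected: lower trust
const c = writeField(b, "verified name", 5);      // accepted: higher trust
```

Because the gate lives at the field level rather than the pipeline level, reordering enrichment stages cannot silently degrade the data.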

Lessons Learned

This project forced careful thinking around provenance, trust levels, non-destructive merging, and normalization boundaries. The hardest bugs were not crashes — they were silently incorrect data that only became visible during actual use.