Wikimedia Germany Launches AI-Friendly Database to Expand Access to Wikipedia Knowledge
The database is publicly available on Toolforge, with a developer webinar scheduled for October 9.

Wikimedia Deutschland has introduced the Wikidata Embedding Project, a new database designed to make Wikipedia’s vast knowledge base more accessible to AI models.
Announced on October 1, the project uses vector-based semantic search, which lets computers work with the meaning of words and the relationships between them, across more than 120 million Wikipedia and sister-site articles.
"We want to create an infrastructure that enables everyone to develop generative AI applications based on verifiable, free and open data. This is an important step toward a digital world in which technologies for the benefit of society are not a footnote but the norm," Lydia Pintscher, Portfolio Lead at Wikimedia Deutschland, said.
Developed in partnership with Jina, a neural search company, and DataStax, a real-time data provider owned by IBM, the project also supports the Model Context Protocol (MCP)—a standard for communication between AI systems and data sources.
The Wikidata Embedding Project aims to support open-source AI/ML communities with a vector database that combines graph and semantic search for flexible applications. It accepts natural-language queries and offers multilingual support across more than 100 languages, widening access for developers worldwide.
Unlike previous tools that relied on keyword searches or the SPARQL query language, the embedding-based approach is tailored for retrieval-augmented generation (RAG), helping large language models incorporate verified Wikipedia content into responses.
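As a rough illustration of the RAG pattern described above, the Python sketch below shows how retrieved passages might be combined with a user's question before being sent to a language model. It is a minimal sketch, not the project's code: the function name `build_rag_prompt`, the prompt wording, and the sample passage are all assumptions made for illustration, and the retrieval step itself is left out.

```python
def build_rag_prompt(question, passages):
    """Combine retrieved passages and a question into a single prompt string.

    Hypothetical helper for illustration; not part of any Wikimedia API.
    """
    # Number each passage so the model can cite its sources.
    numbered = "\n".join(
        f"[{i}] {text}" for i, text in enumerate(passages, start=1)
    )
    return (
        "Answer the question using only the sources below, "
        "and cite them by number.\n\n"
        f"Sources:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Placeholder passage standing in for what a retrieval step would return.
passages = [
    "Marie Curie was a physicist and chemist who conducted "
    "pioneering research on radioactivity.",
]
prompt = build_rag_prompt("Who was Marie Curie?", passages)
print(prompt)
```

The point of the pattern is that the model answers from text supplied at query time rather than from memory alone, which is what makes the response traceable to a source.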
The system organises data semantically. For example, a search for “scientist” might return lists of notable nuclear scientists, translations of the term, related concepts like “researcher,” and images. This structured context improves the accuracy and depth of AI-driven queries.
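To make the idea concrete, the sketch below ranks a handful of terms by cosine similarity to "scientist". It is a toy example under stated assumptions: the three-dimensional vectors are hand-made for illustration, whereas a real system would obtain much larger vectors from an embedding model, and the names `TOY_VECTORS` and `semantic_search` are hypothetical rather than part of the project.

```python
import math

# Hand-made toy "embeddings" (illustrative values only, not real model output).
TOY_VECTORS = {
    "scientist": [0.9, 0.8, 0.1],
    "researcher": [0.85, 0.75, 0.2],
    "nuclear physicist": [0.8, 0.9, 0.05],
    "banana": [0.05, 0.1, 0.95],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def semantic_search(query, vectors, top_k=2):
    """Rank every other term by similarity to the query term's vector."""
    query_vec = vectors[query]
    scored = [
        (label, cosine_similarity(query_vec, vec))
        for label, vec in vectors.items()
        if label != query
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


for label, score in semantic_search("scientist", TOY_VECTORS):
    print(f"{label}: {score:.3f}")
```

With these toy values, "researcher" and "nuclear physicist" rank well above "banana" even though none of the terms share a keyword with the query, which is the difference between semantic and keyword search.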