Datasets
In the interest of space and clarity, not every available dataset is listed in the subcategories below. The Speech Resources Consortium page, for example, provides dozens of corpora, as does the Japan Data Catalog for the Humanities and Social Sciences. Please refer to their individual pages for up-to-date information on available datasets.
Repositories and Portals
Text Data
- Dataset of Premodern Japanese Text (text and page images)
- Dataset of Edo Cooking Recipes (text and page images)
- e-Stat (statistical information from government agencies)
- Statistics Japan (statistical information from the Statistics Bureau)
OCR Training
- KMNIST Dataset (kuzushiji)
- Dataset of Modern Magazines (includes 東洋学芸雑誌, 国民之友, 明六雑誌)
Maps/GIS
Image Data
IIIF
datasets.1654618779.txt.gz · Last modified: 2022/06/07 16:19 by prcurtis