Tutorials & Reviews
Cleaning Data
Taiyō Project: First Steps with Data
Molly Des Jardin
Digital Japanese Literature: Aozora Bunko (removing ruby from texts)
Christopher Morse
Encoding
Encodings of Japanese
Alexandre Elias
IIIF
Mapping
OCR & Kuzushiji Reading
An Introduction to the miwo kuzushiji app
ROIS-DS CODH
Cursive Japanese and OCR: Using KuroNet
James Harry Morris, The Digital Orientalist
Practicing Reading Cursive Japanese with Miwo
James Harry Morris, The Digital Orientalist
Google Docs and OCR: Some Experiments Transcribing Japanese Language Texts
James Harry Morris, The Digital Orientalist
Text Segmentation
Japanese Text Segmentation and Analysis with Web ChaMame
James Harry Morris, The Digital Orientalist
fugashi, a Tool for Tokenizing Japanese in Python
fugashi: A Tool for Japanese Tokenization
Paul McCann
Basic Python for Japanese Studies: Using fugashi for Text Segmentation
James Harry Morris, The Digital Orientalist
Tutorials on linguistic corpora (J)
National Institute for Japanese Language and Linguistics (国立国語研究所)
Text Mining
An Introduction to Japanese Text Mining
Mark Ravina (UT Austin)
Using Voyant Tools with Historical Japanese Texts
James Harry Morris, The Digital Orientalist
Web Scraping
Crawling Aozora Bunko
Molly Des Jardin
Web Scraping with Python for Beginners
James Harry Morris, The Digital Orientalist