tutorials

Differences

This shows you the differences between two versions of the page.

tutorials [2022/06/22 18:18] – [Text Segmentation] prcurtis
tutorials [2022/12/10 17:09] – [Text Segmentation] prcurtis
Line 8: Line 8:
 **[[http://darthcrimson.org/digital-japanese-literature-aozora-bunko/|Digital Japanese Literature: Aozora Bunko]]** (removing ruby from texts)
 [[https://experiencing.art/|Christopher Morse]]
 +
 +===== Dictionaries for Word Segmentation =====
 +
 +**[[https://www.dampfkraft.com/nlp/japanese-tokenizer-dictionaries.html|An Overview of Japanese Tokenizer Dictionaries]]**
 +[[https://www.dampfkraft.com/|Paul McCann]]
  
 ===== Encoding =====
Line 24: Line 29:
 [[https://www.mstavros.com/home|Matthew Stavros]]
  
 +**[[https://digitalorientalist.com/2021/11/09/i-just-want-the-data-a-short-guide-to-gsi-japan-for-non-japanese-speaking-users/|“I Just Want the Data!”: A Short Guide to GSI Japan for Non-Japanese-Speaking Users]]**
 +[[https://digitalorientalist.com/author/pulpbindandbond/|Matthew Hayes]]
 ===== OCR & Kuzushiji Reading=====
  
Line 34: Line 41:
 **[[https://digitalorientalist.com/2022/06/03/practicing-reading-cursive-japanese-with-miwo/|Practicing Reading Cursive Japanese with Miwo]]**
 [[https://digitalorientalist.com/author/morrisjh/|James Harry Morris]], //[[https://digitalorientalist.com|The Digital Orientalist]]//
 +
 +**[[https://digitalorientalist.com/2022/12/09/genius-loci-extracting-names-and-places-from-japanese-texts/|Genius loci: extracting names and places from Japanese texts]]**
 +[[https://digitalorientalist.com/about-anna-oskina/|Anna Oskina]], //[[https://digitalorientalist.com|The Digital Orientalist]]//
  
 **[[https://digitalorientalist.com/2021/04/09/google-docs-and-ocr-some-experiments-transcribing-japanese-language-texts/|Google Docs and OCR: Some Experiments Transcribing Japanese Language Texts]]**
Line 55: Line 65:
 **[[https://clrd.ninjal.ac.jp/tutorial.html|Tutorials on linguistic corpora (J)]]**
 [[https://www.ninjal.ac.jp/english/|National Institute for Japanese Language and Linguistics (国立国語研究所)]]
 +
 +
 +**[[https://digitalorientalist.com/2022/12/09/genius-loci-extracting-names-and-places-from-japanese-texts/|Genius loci: extracting names and places from Japanese texts]]**
 +[[https://digitalorientalist.com/about-anna-oskina/|Anna Oskina]], //[[https://digitalorientalist.com|The Digital Orientalist]]//
  
 ===== Text Mining =====
Line 67: Line 81:
 **[[https://leanpub.com/japanesenlp|Introduction to Japanese Natural Language Processing]]**
 [[https://twitter.com/mhagiwara|Masato Hagiwara]] and [[https://www.dampfkraft.com/|Paul O'Leary McCann]]
 +
 +
 +**[[https://digitalorientalist.com/2022/12/09/genius-loci-extracting-names-and-places-from-japanese-texts/|Genius loci: extracting names and places from Japanese texts]]**
 +[[https://digitalorientalist.com/about-anna-oskina/|Anna Oskina]], //[[https://digitalorientalist.com|The Digital Orientalist]]//
  
 **[[https://steviepoppe.net/blog/2020/04/a-quick-guide-to-data-mining-textual-analysis-of-japanese-twitter/|A Quick Guide to Data-mining & (Textual) Analysis of (Japanese) Twitter Part 1: Twitter Data Collection]]
tutorials.txt · Last modified: 2022/12/10 17:10 by prcurtis