Japanese Text Analysis with Visual Novels and Anki – Learn the Most Common Words, Read Faster!

All right, here we go. Today I'm going to show you guys how to hook into a visual novel, set it on autoplay so you can get all the text from it, run a text analysis on a sample, add the words into Anki, and configure your Anki add-ons so that you get the furigana over the kanji.

All right, so we've got our visual novel here, Angel Beats! 1st Beat. First I need to find the exe for it; usually you can check the folder and find the application that your visual novel is using. So this one is just SiglusEngine. Good. So we're going to take ITHVNR, open that, and hook it into Angel Beats. First attach, then once it attaches we'll add the profile. OK, for the options, you can turn off auto copy to clipboard.

Now we're going to look for the text. For Angel Beats it's usually one of the later tabs under here. This one looks like it's pretty much the text here, so let's go forward a few lines to make sure. We're doing pretty well.

Now turn on autoplay. That's not going to make her stop talking, so we're going to go to the config, go to sound, and uncheck everything or just turn it all the way down, whichever you want to do. In any visual novel, turn off the voices. Here's your text speed: this speed is normally just for when you hit the Enter key, and this one is the autoplay speed. It's basically, what do you call it, the number of seconds per character and the number of seconds per line, pretty much, and it gives you this right here. You can set it wherever you like. I turned it down this much; I thought that made it pretty fast. Then you can turn on autoplay, and you'll notice the lines appear real fast, you can see the question marks go by, and it pretty much goes through the text really quickly. When a choice comes up you can just pick one.

Now, if you're looking for a guide for the choices, I'll show you what you can do. You look up your visual novel on VNDB,
because, say, you don't want to type out the Japanese, and so I can just do "Angel Beats walkthrough" and search that up. Seiya-saiga usually has pretty good walkthroughs. So you can just use your walkthrough along with the visual novel: have it go on autoplay, and every time there's a choice you just pick that choice. You can check them off on the site, or you can save the guide to a new text file and do whatever. I like to put little X's here to X out the choices, and I find that works for me. I don't really need the guide right now, though.

So I'll have it go through and get the text pretty fast, but I'm going to stop it for now. With this text, we can hit Ctrl+A to highlight everything and copy it, open a new Notepad window, paste the text in with Ctrl+V, and save it as a new visual novel text file. I already used this file name, so I'm going to overwrite the file. Now we have our visual novel text, so I can just minimize Angel Beats, and we're ready to do our text analysis. You can close this; we don't need it anymore.

Text analysis tool: open this up. Input file: on the desktop, the VN copy text. Output: I'll just put it right on the desktop. It's going to generate a file for each one of the things you have checked here. This one is user-based readability; you'd add a file with the list of all the words you know for that. Here we're just going to do a regular text analysis, so we don't really need to change any settings. Just hit Analyze, and then you can open the directory or hit OK, since we're just sending it to the desktop for now.

The word frequency report is what we're going to be working from. Open it up. What I like to do is copy this entire thing, make a new blank Excel spreadsheet, and paste it right in. Now what you basically have is your list of all the words. If you go down this list, what I find to be one of the coolest stats is this one right here, which is basically the top percentile of words that you need to know in
order to... well, it's the percentage of word usage. So if you know up to word 171, if you know all of these words, then you will know 80% of the word usage in the entire visual novel. It doesn't mean 80% of the total vocabulary; it means that because the usage of these words is so high, the top 171 words make up 80% of the text in Angel Beats. Now, it's not going to be exactly like that, because we just took a small sample. It's better to find a large sample of 5,000, 10,000, or 20,000 lines, which I've done for several visual novels that I'll link later in my guide.

So from this, you can copy just these words, copy them into another blank Excel spreadsheet, and paste them in. I'll paste them a line below so I have room to show you. What I like to do is put the kanji version here, so this would be shinu in kanji, and then here you can do the furigana like this, with the reading in brackets. So you've got the kanji version, the furigana in brackets, and the definition. You can just add words into this list like that, and then you can save the file to the desktop with the save type CSV (comma delimited). That's good because you can import your list of vocab words into Anki. So you can call it something like "vocab import" and say yes.

What I like to do is use Chiitrans. I open it and change the options so that it's capturing text from the clipboard. I had that enabled already, so I'll just re-enable it. Then what we can do is Ctrl+C to copy a word. We can go down here to some real words, you know, sekai: Ctrl+C, and there it is. You've got the furigana that appears over the top of it and the definitions. You have names too, whichever you want. And you can add the furigana here in the spreadsheet; you'd do it this way, because the "kai" goes over the second kanji. And there you go. My scrolling ability just disappeared. Then "the world", or whichever
definitions you want. Oh, close that one. Yeah, everything is being strange for a bit.

What else you can do: I made this empty Anki profile, and I'm going to close out some of these things so that it's easier to tell what's going on here. What you're going to do is go into the translation tools pack and open the Japanese cards format deck. Normally when you import a deck it'll be all new stuff; this one is going to be for the vocab you have. I want to show you how you can make your own new vocab deck. So you hit add deck, change the name, OK. Now the card type is probably going to be set to Basic as the default. Here we have three different card types instead. "Japanese Furigana 1 - Kanji 2" will basically make a card that one day shows the furigana, and then the next version of the card that appears will just have the kanji. This one is kanji plus furigana: with kanji and furigana on the front side, you'll never see a word by only its kanji; it'll always have the furigana on it. And this one is for kanji only. Say you know a word like baka. I already know baka, but I don't really know it by the kanji, so it might be useful to add a card like this. So you add it in.

Now, what you have here is an audio field, which I like, because you can go to Forvo.com, copy in the kanji, search the word, and pick your voice. I like strawberrybrown. Open the folder, Ctrl+C to copy the sound file, close that up, paste it right in, and add it. And we can add some more cards. We can add a Furigana 1 - Kanji 2 card, so say, I don't know, "fish". Happy fish. Get another audio thingy, copy the sound, paste, bam.

But ideally what you're going to be doing is going through this list and adding all these words. A lot of the ones at the top are going to be grammatical words, so you should go through Tae Kim or some other grammar source of your choosing, Genki, whatever you want, and then go through primarily the words that you
don't know. You can even learn some grammar that you don't have experience with, just because you'll find so much of it when you read a visual novel.

So we can add this word, tenshi. Say I didn't know what it was. We just copy it and then check Chiitrans: tenshi. You might be able to copy the furigana from here; sometimes it's kind of tricky. I hit Ctrl+C and it gave it to me there. You don't actually need the furigana in the meaning, right. I can get the audio, or you can record your own audio, but I'm not going to need that. And tenshi, "angel", I get that, copy, tenshi, bam.

Oh yeah, what you could do here instead, so I can show you it this way: I can remove this card. You can get rid of the tenshi card, close that up, go back to the deck in Anki, and do File, Import. What we can basically do is import this word, this word, and this word, because all three fields are filled out, all three columns that we have in Excel. So you go here to the vocab import file. This is a way you can look up a lot of words at once without having to deal with Anki. You can go back and add the audio later, or skip it if you don't want to add audio.

So we've got the vocab import CSV. We have comma-delimited, because we saved it as a CSV (comma delimited) file. Allow HTML is useful. You can turn it off, but I would keep it on, because what you can do is, I don't know, say shinu has another definition that we want, "to cease", "to stop": you can put a line break in with this code, and I'll show you how it appears later. Save it. Actually, I think I already have the file open, so... there it is. Deck: vocab. Comma-delimited. So our format was the expression, then the reading, then the meaning, so I have to switch these two. And then we import... oh, "not UTF-8". Not sure if there's a... you can do Text (Tab delimited); I think I
might have been how I used to do it. Then close this, and then you have to change this up and save it as UTF-8, replace the file, and close it. Now it's tab-delimited. Now I can import: allow HTML, pick the deck, then expression, reading, meaning. Import. Three notes added, one note unchanged. And already I can browse here, and hey, you've got shinu, and the line-break code got changed into a new line.

All right, so if you're not getting the furigana automatically when you add in your word, it's because you don't have the Japanese add-on. You can go to Tools and browse for add-ons; the Browse button takes you right to the page. The Japanese Support add-on is the one you need. Copy this code, paste it in, hit OK, and it'll download the add-on. Then you have to restart Anki. I like one other add-on, Toggle Full Screen. Oh, we got an error because I already have it in here; hopefully that doesn't break my add-on. Toggle Full Screen, OK, restart Anki, and then your add-ons should be working.

Well, that's pretty much what you want to do to be able to get a large amount of words out of a visual novel. I can show you some text analysis that I have, the Angel Beats analysis, so from here on out the video is extra; I'm just showing you that. Here's the Angel Beats analysis from a large portion of the visual novel. You can open an Excel spreadsheet and paste it in, and what you can see is that to reach that 80% proficiency you only need 531 words. So if you learn the top 531 words, you will know four out of five words that appear. Just scroll down further: if you want to hit 85%, you need 878 words to reach that amount, and 1,492 will get you to 90%. This is really cool, because a lot of these words are going to overlap so much between different visual novels. So if you study this instead of working on something like Core 2k, you're going to be able to learn the most common
words used across several different visual novels, and you're doing it really efficiently by using the text analysis tool. And you won't have to save a lot of words in the middle of reading visual novels. You can just read, look up grammar, look up the things you need to figure out, and at the end of the session go into the common words list that you have here, start adding the words to Anki, and add in those Forvo audio files. You'll soon be able to read much more quickly.

I've recently been adding a lot of words from Aiyoku no Eustia and reaching a high level of proficiency in it. I think I know like 88% of the word usage in Eustia. Before, I was saving words randomly, so you might save a word like this one that's used only once, or this one, which seems like a useful word; it's got "tea shop" in it, so it might be a cool word, but it's not going to get you to that high proficiency level. That's why I like to save words this way instead of just saving them sporadically. I found that in a small period of time I've gained so much proficiency this way. But yeah, hopefully this helped. I'm not sure if anyone knows about this method.

What's really cool about the text analysis tool is that you can actually use it for subtitle files too, so I can show you some of that also. I have, from Kitsunekko, there it is, a giant subtitle pack here. Say you want to know the words from this show, [title unclear]. I open this up, extract this, OK: subtitle files. This one is actually Chinese, and I think this one is probably not Chinese, so I'll use this one instead. Go to the Japanese Text Analysis Tool, open that up, and instead of a file, the input is a directory. So we're going to put these in their own little folder. Output directory: desktop. OK, hit OK. Actually, let's put that in the same folder to keep it simple. Now you can get the text analysis for all the subs of an anime, so you can watch it while
looking up words and stuff and learning the words, you know, just learning the grammar and loosely focusing on things as you go, not having to write down every word you come across that you don't know. Run the text analysis on it. It takes a little while because it's going through a bunch of different files. Then hit OK and you can see the report. You're going to see a lot of those same particles and other cool stuff, but now you can use this to learn the most common words in anime as well.
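
The coverage figures mentioned in the video (171 words for 80% of the small sample; 531, 878 and 1,492 words for 80%, 85% and 90% of the larger Angel Beats sample) come from a running total over the frequency list. Below is a minimal sketch of that arithmetic in Python. It assumes the word frequency report has already been reduced to word/count pairs; the function names and the toy counts are made up for illustration, and this is not necessarily how the analysis tool itself computes its column.

```python
from collections import Counter


def coverage_table(counts):
    """Return (rank, word, count, cumulative_percent) rows, most frequent first."""
    total = sum(counts.values())
    rows = []
    running = 0
    for rank, (word, n) in enumerate(counts.most_common(), start=1):
        running += n
        rows.append((rank, word, n, 100.0 * running / total))
    return rows


def words_needed(counts, target_percent):
    """Smallest number of top-ranked words whose usage reaches target_percent."""
    for rank, _word, _n, cumulative in coverage_table(counts):
        if cumulative >= target_percent:
            return rank
    return len(counts)


# Toy counts (hypothetical, 100 word occurrences in total).
sample = Counter({"の": 50, "は": 30, "世界": 10, "天使": 6, "音楽": 4})

for target in (80, 90, 95):
    print(target, "% of usage needs the top", words_needed(sample, target), "words")
```

With real data you would fill the `Counter` from the report instead of typing it in; the point is only that a small number of very frequent words accounts for most of the running total.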

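The import step in the video stalls on a "not UTF-8" error until the spreadsheet is re-saved as tab-delimited UTF-8. As a rough alternative, the same three-column file (expression, reading, meaning) can be written from a script. This is only a sketch: the function names and sample entries are hypothetical, and it assumes `<br>` is the line-break code meant in the video, which only renders as a line break when "Allow HTML" is ticked during import.

```python
import csv
import io


def to_anki_tsv(entries):
    """Build tab-delimited text: expression, reading, meanings joined by <br>."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for expression, reading, meanings in entries:
        writer.writerow([expression, reading, "<br>".join(meanings)])
    return buf.getvalue()


def save_anki_tsv(path, entries):
    """Write the text as UTF-8, the encoding the import in the video asked for."""
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(to_anki_tsv(entries))


# Hypothetical entries in the column order used in the video.
entries = [
    ("世界", "せかい", ["world", "society"]),
    ("天使", "てんし", ["angel"]),
]
tsv_text = to_anki_tsv(entries)
print(tsv_text)
# save_anki_tsv("vocab_import.txt", entries)  # uncomment to write the file
```

The column order has to match the field mapping chosen in the import dialog, which is why the video swaps two columns before importing.
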
8 thoughts on “Japanese Text Analysis with Visual Novels and Anki – Learn the Most Common Words, Read Faster!”

  1. Very nice, thank you. By the way that's not how you pronounce 世界, just wanted to let you know.

  2. Hey man, I loved your video, helped me a lot 🙂
    Quick question, where are these large samples of visual novel text you talk about? I can't seem to find them.

  3. Hi, I have the same Text Analyzer, and I did something to it. Now the word frequency won't come up, just the Kanji report. I use a text file and everything is exactly the same as yours in the video. I've also downloaded and redownloaded it … can you give me advice please!

  4. Thanks for your video, it's great and useful for learning Japanese. But I have a problem when using the add-on "Japanese Support":
    I get this error in Anki. I use Windows 7 and don't know how to fix it.
    Please give me some advice on fixing this.

    "An error occurred in an add-on.
    Please post on the add-on forum:

    Traceback (most recent call last):
    File "aqt\webview.py", line 18, in run
    File "aqt\editor.py", line 462, in bridge
    File "anki\hooks.py", line 32, in runFilter
    File "C:\Users\Administrator\Documents\Anki\addons\japanese\reading.py", line 211, in onFocusLost
    n[dst] = mecab.reading(srcTxt)
    File "C:\Users\Administrator\Documents\Anki\addons\japanese\reading.py", line 81, in reading
    IOError: [Errno 22] Invalid argument"

  5. Your tutorial is very helpful, big thanks.
    I've never seen someone who points out the major things like you.
    You are so amazing.
    I will try to use every bit of your method.
    The Google sheet is well written.
    It's really helpful, thanks.

  6. Consider using Anki plugins to automate the routine: "Automatic Japanese Dictionary Lookup" to generate definitions and readings (you will need to delete unneeded definitions though, it generates too much), and I am pretty sure there is a plugin to bulk-generate voice reading files too.

    Also, "Japanese Example Sentences" might be a good addition.

  7. JTAT is giving me everything BUT the word frequency. I made sure to do exactly as you did (even used ASCII encoding and told JTAT it was Shift_JIS, even though it made my computer nerd heart suffer) and it still just gives me a blank text file… Do you happen to know how to solve this issue?

    EDIT: SOLUTION FOUND! Anyone with the problem, here's how you fix it. The starter pack linked here is apparently broken; you need to download JTAT from SourceForge and not the folder he gives in the Google Drive link.

    EDIT2: How do you deal with duplicates? When analyzing multiple VNs you're bound to get the same vocab multiple times. Do you manually sort it or what?
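
On the duplicates question, one possible approach (a sketch, not something shown in the video) is to add the counts from each visual novel's frequency list together so every word appears once, and then drop the words that are already in your deck. The function names and the sample counts below are hypothetical.

```python
from collections import Counter


def merge_frequency_lists(*frequency_lists):
    """Sum the counts from several word/count mappings; each word appears once."""
    merged = Counter()
    for counts in frequency_lists:
        merged.update(counts)
    return merged


def new_words(merged, known):
    """Words not studied yet, most frequent first."""
    return [word for word, _count in merged.most_common() if word not in known]


# Hypothetical counts from two different visual novels.
vn_a = {"世界": 12, "天使": 7}
vn_b = {"世界": 5, "音楽": 9}

merged = merge_frequency_lists(vn_a, vn_b)
todo = new_words(merged, known={"世界"})
print(merged["世界"], todo)
```

Because the counts are summed, a word that is common in every visual novel rises to the top of the combined list instead of showing up once per title.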

