Excel has a lot of quantitative power, but it has always had issues with parsing text. While there are the basic text functions (left, mid, right) and a useful substitution function, sometimes something that seems like it should be easy isn’t. As a case in point, if you want to extract individual words from text, while it can be done with the Text to Columns feature using the space as a delimiter, it is not very easy to do with a formula, because you basically need to find where a certain space is within the text, which means counting where the previous spaces are, which requires a loop. Which then requires VBA. So, I did a little VBA coding tonight, and tried to make a function that not only can find single words, but also phrases, as n-grams. I have now used this to do some qualitative analysis of text, where combining these functions with Pivot Tables, I have been able to have Excel be a decent basic qualitative analysis tool for text.
Here is the VBA file for the code, since when I tried to post the code inline with this message, my WordPress host thought I was trying to do some sort of injection attack or something. Also you can find an example of how I used the code in the analysis I recently did of Literature and Medicine college courses.
Update: I have improved the efficiency of the code by using VBA’s split function instead of using a lot of extra loops.