Sunday, March 16, 2008

Should have learned bash....

So I looked at the corpus. We wanted to shorten it a bit more but had no good selection mechanism, so I wrote a small program to do the selection for us :) I hacked it together as a short Java app. It reminds me, though, that one day I should really learn to do this kind of simple text parsing/filtering in a nice scripting language. Perl, or just bash, would be nice to know. At least the basics... Oh well, another time. I mean, the Java program works! I committed both the filtered corpus and the program to the Subversion repository... check it out! I'll work a bit on the thesis text now.
