My earlier attempt to create a Telugu Wikipedia dump gave me not only a corpus but also a ready reckoner: the Wikipedia articles themselves, close at hand.
With one text file per article, I could run various string-processing commands on each page, which surfaced many new insights. For example, disambiguation pages, otherwise hard to track down, could be found easily.
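As a minimal sketch of that kind of per-article scan: the snippet below walks a directory of one-file-per-article dumps and flags files containing a disambiguation marker. The directory layout and the marker string are assumptions for illustration; the actual template text on Telugu Wikipedia may differ.

```python
from pathlib import Path

# Hypothetical marker; substitute the disambiguation template text
# actually used on Telugu Wikipedia.
MARKER = "disambiguation"

def find_disambiguation_pages(article_dir, marker=MARKER):
    """Return names of article files whose text contains the marker."""
    hits = []
    for path in sorted(Path(article_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        if marker in text:
            hits.append(path.name)
    return hits
```

The same pattern works for any other string-level query over the per-article files, which is what made this layout so convenient.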
Today I tried a method described on KDnuggets. The procedure was simple: a set of two scripts was provided. The first script takes the XML dump and converts it into a single text file containing the space-separated words of the Telugu Wikipedia dump. There is no taxonomy and no distinction; the entire text of Wikipedia ends up in one text file.
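This is not the KDnuggets script itself, but the dump-to-single-file step can be sketched with the standard library alone: stream the MediaWiki XML, pull out each page's `<text>` element, crudely strip markup characters, and append the words to one output stream.

```python
import re
import xml.etree.ElementTree as ET

def dump_to_text(xml_path, out_path):
    """Stream a MediaWiki XML dump and write all article text as one
    space-separated stream of words, with no per-article structure."""
    with open(out_path, "w", encoding="utf-8") as out:
        # iterparse streams the file, so even large dumps fit in memory.
        for _, elem in ET.iterparse(xml_path):
            # The namespace prefix varies by export version; match any <text> tag.
            if elem.tag == "text" or elem.tag.endswith("}text"):
                raw = elem.text or ""
                # Crude markup strip: replace link/template punctuation with spaces.
                cleaned = re.sub(r"[\[\]{}|=*']", " ", raw)
                words = cleaned.split()
                if words:
                    out.write(" ".join(words) + " ")
                elem.clear()  # release the element to keep memory flat
```

A real pipeline would use a proper wiki-markup parser, but this captures the shape of the transformation: the output is exactly one flat text file of words.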