http://www.cs.berkeley.edu/~jey/ampcamp6/training/data-exploration-using-spark-sql.html
The exercise at the end ("How many articles contain the word “california”?") requires the "text" field of wikiData. Unless I missed an earlier explanation of the schema, the exercise is confusing because the "text" field isn't mentioned anywhere before it. It would be enough to mention a way to explore the schema (e.g., wikiData.schema.fields).
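For illustration, here is a minimal sketch of the kind of schema-exploration snippet the tutorial could show before the exercise. It assumes the PySpark version of the tutorial, with wikiData being the DataFrame loaded earlier; list_fields is a hypothetical helper name, not part of Spark.

```python
def list_fields(df):
    """Return (column name, type) pairs for a Spark DataFrame.

    list_fields is a hypothetical helper name, not part of Spark.
    """
    # df.schema is a StructType; its .fields attribute is a list of StructField.
    return [(field.name, field.dataType.simpleString()) for field in df.schema.fields]


# Assumed usage in the tutorial's PySpark shell, where wikiData is already loaded:
#   for name, type_name in list_fields(wikiData):
#       print(name, type_name)
```

Calling wikiData.printSchema() should show the same information as a tree, which may be enough on its own to reveal the "text" field.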