Writing the previous post got me thinking about how programmers use programs, and how that relates to teaching in general.
A couple of years ago, I wrote a little prototype address book that accounted for my misspellings. The idea was that if someone had a name like Karalyne, a search would find it even if I spelled it "Caroline".
To do so, I used the Soundex algorithm in Python:
>>> import fuzzy
>>> soundex = fuzzy.Soundex(5)
>>> soundex('Caroline')
'C6450'
>>> soundex('Karalyn')
'K6450'
Then I used the Levenshtein algorithm to determine the best match, and the matches got better as the user typed.
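The matching code itself isn't shown in this post, so what follows is only a minimal sketch of how the two pieces might fit together, not the original program. To keep it runnable without the fuzzy library, it uses a simplified, hand-written Soundex and a plain dynamic-programming Levenshtein distance. The names soundex, levenshtein and best_matches are hypothetical, and ranking by the distance between Soundex codes first, then between the raw spellings, is an assumption about how one could combine them.

```python
# Simplified American Soundex: letter groups that sound alike share a digit.
CODES = {}
for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")]:
    for ch in letters:
        CODES[ch] = digit


def soundex(name, length=5):
    """Return a Soundex-style code, zero-padded or truncated to `length`."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    out = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        # Emit a digit only when it differs from the previous sound.
        if code and code != prev:
            out += code
        # 'h' and 'w' do not separate repeated sounds; vowels do.
        if ch not in "hw":
            prev = code
    return (out + "0" * length)[:length]


def levenshtein(a, b):
    """Count the single-character edits needed to turn a into b."""
    prev_row = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        row = [i]
        for j, cb in enumerate(b, start=1):
            row.append(min(prev_row[j] + 1,                # deletion
                           row[j - 1] + 1,                 # insertion
                           prev_row[j - 1] + (ca != cb)))  # substitution
        prev_row = row
    return prev_row[-1]


def best_matches(query, names):
    """Sort names by how close they sound to the query, then by spelling."""
    q_code = soundex(query)
    return sorted(names, key=lambda n: (levenshtein(q_code, soundex(n)),
                                        levenshtein(query.lower(), n.lower())))


contacts = ["Bob", "Marcus", "Karalyn"]  # made-up address book entries
print(best_matches("Caroline", contacts)[0])
```

With these made-up contacts, "Karalyn" ranks first for the query "Caroline", because the two Soundex codes differ only in their first letter.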
When I wrote the program, I had only the vaguest idea of how Soundex works, and that is still true today. I know that it transforms a string into a code that corresponds to how the string sounds, but I don't need to understand its implementation to use it.
That's the way most libraries work: one doesn't need to understand them in order to use them.
In this TED talk, Conrad Wolfram argues that this is the same way that we should be teaching math to children, with an emphasis on usage first, and understanding afterwards.
His examples include geometry, but the same arguments could be made about statistical analysis.
For most people working with a dataset, the most important thing is determining the type of data and knowing what questions to ask of it. Figuring out whether or not the data fits a certain model will help you determine the type of analysis that needs to be done next.
If we taught statistics this way, with an emphasis on the analysis rather than the process, we could teach it not in college, as Arthur Benjamin suggests, but earlier, even as early as Algebra. The concepts in a subject like statistics are concrete and tangible, and it would be possible for students to be comfortable using them even before they understand them, just as I'm comfortable using Soundex, even when I have only the most rudimentary understanding of how it works.