George Box famously stated, “All models are wrong, but some models are useful.”

Educational researcher Robert Marzano recommends that a power curve be used to evaluate a student’s current level of understanding. This curve is supposed to be an effective model for assigning a score on a 4-point rubric to a student who is assessed repeatedly, and it is supposed to recognize and reward growth over time. You can see in this help document 4 nice examples of the power law behaving as Marzano promises.

However, in my experience, the power law has serious flaws as a model for assessing student growth. Fatal flaws. Here are the most egregious.

The biggest problem with the power law appears when a student has a “whoops” and bombs an assessment, earning a score of 1. If that 1 occurs in the middle of the sequence of scores, the curve will not adjust upward and the student becomes frustrated.

Scores of 3-3-1-3-4-4 give a power law score of 3.07.

I didn’t realize the problem until I had the same student come after class repeatedly to retake the same standard. My goal is that when a student does this, she is justly rewarded with a higher grade on that standard. However, she pointed out to me that because of the 1 she earned after the third assessment, her grade was not increasing.
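
For anyone who wants to check that 3.07, here is a minimal sketch in Python. It assumes the score comes from an ordinary power regression: pair each score with its index (1, 2, 3, ...), fit y = a·x^b by least squares on the logs of both, and read off the fitted value at the last index. The function name power_law_score is my own, and a particular gradebook may do the calculation differently, so treat this as an illustration rather than the official formula.

```python
import math

def power_law_score(scores):
    """Fit y = a * x**b to (1, s1), (2, s2), ... and return the fitted
    value at the last index. Assumes at least two scores, all above 0."""
    n = len(scores)
    # Power regression is a straight-line fit after taking logs of x and y.
    xs = [math.log(i) for i in range(1, n + 1)]
    ys = [math.log(s) for s in scores]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                # exponent of the power curve
    ln_a = mean_y - b * mean_x   # log of the coefficient a
    # Evaluate the fitted curve at the most recent assessment.
    return math.exp(ln_a + b * xs[-1])

# The sequence from the example above; this comes out to roughly 3.07.
print(round(power_law_score([3, 3, 1, 3, 4, 4]), 2))
```

Under this assumption the two 4s at the end are not enough to pull the score much above a 3, because the fit still has to account for the 1 in the middle.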

Other problems with the power law include:

*Students and parents have no clue how their grade is being calculated. (It took me far too long to realize that I could pair the indices 1, 2, 3, etc. with a student's scores and then use the power regression button on my calculator to predict a score. You will note on the second graphic that the Casio Prizm has an awesome feature of making predictions directly on the scatterplot.)

*You cannot make a problem worth fewer than 4 points. The power law will read the lower score as a decrease in ability. Likewise, adjusting for difficult questions is unmanageable.

*You cannot weight one assignment as more important than another.

*If you test the same standard several times at once, there is no way to enter the scores. The best you can do is average the scores together and then enter that score repeatedly.

*Because some of my standards are not tested enough times for the power law to take effect, I have to use an average. But this only adds to the confusion about how grades are determined: some standards are on the power law, some on an average.

I am not giving up on grading by standards. I think in the fall I will attempt a new version of SBG that uses weighted categories. Frankly, I am content with neither of the two options I've heard on Twitter. Neither taking just the latest score nor taking just the highest score seems satisfactory to me. More on this later.