Thoughts on Resilience & Grading

I spent the last three days helping to facilitate a leadership retreat for some of our rising 10th, 11th, and 12th graders. This year’s theme was resilience, which we linked closely to one’s relationship with failure.

In several different ways, we asked students to reflect on the extent to which the school provides opportunities for them to fail, process what happened, make adjustments, and persevere through a difficult situation.

As we concluded the retreat this morning, we invited the students to consider how they and the adults at our school could facilitate the development of resilience during the upcoming school year. I was overjoyed with the first comment a boy put forward, which he intended for both students and adults:

Too often we get so focused on grades that we lose sight of the learning. Let’s keep the conversations about the learning rather than the grade.

I was blown away because I had hoped a student would bring this up, and this boy came right out with it. I’d like to make some strategic changes in my messaging around grading, reporting, and assessment this school year, and making the connection to resilience explicit could help keep these shifts rooted in a value to which the community has expressed a commitment.

My guiding question is this: What grading, reporting, and assessment practices (and policies) most effectively promote resilience in students?

Many broad categories of issues come to mind, but in my current context I’d like to focus on redos and retakes.

I would like to try to assemble the most concise, convincing evidence that allowing multiple attempts at demonstrations of mastery facilitates the development of resilience. (I would go further and say that the practice of averaging in the scores of unsuccessful attempts impedes the development of resilience.)

Here’s a selection of articles I’ve read that support this view.

As Thomas Guskey writes in On Your Mark, we won’t get very far if we don’t agree on the purpose of grades, so the goal here is to convince someone who believes that the primary purpose of grades (in math class especially) is to summarize performance on one-time tests (via the arithmetic mean).

What do you think?

  1. What grading, reporting, and assessment practices (and policies) most effectively promote resilience in students?
  2. What is the most concise, convincing evidence you know of that allowing multiple attempts at demonstrations of mastery facilitates the development of resilience?

P.S. The value of mastery-based (competency-based) learning has begun to make its way to the independent school world as well: in this article from 2014, David Cutler writes about his expectation that traditional grades will be obsolete by 2034.

Early Steps in Grading Transformation: An Email to My Colleagues

Hello fellow math teachers,

I’d like to share an email I sent to my department this morning describing my experiences moving forward with mastery-based learning and standards-based grading. I’m working to move towards the ideas articulated by many in the #ttog community.

I’d welcome your feedback on the sample progress reports, grading frameworks, or presentation of ideas I’ve put forward below.

– Tom

Dear math colleagues,

I hope you’re having a wonderful summer! I’ve been back in DC for about a week now after teaching two math classes at Phillips Academy in Andover, MA for a residential program called (MS)^2: Math and Science for Minority Students.

After reading On Your Mark by Thomas Guskey at the beginning of the summer, I decided to use the classes I taught at Andover as an opportunity to put together a “proof of concept” for a standards-based method of grading and reporting. In the spirit of moving forward with the conversation several of us began at the end of the school year, I’d like to share with you a method of grading and reporting I have been working on for a few years and had a chance to refine this summer.

I’ve attached a sample end-of-summer progress report for each class I taught:

A few notes for context:

  • I saw each class of 13–14 students for 110 minutes in the morning and 70 minutes in the evening every weekday for five weeks.
  • “Math IA” had the bottom third of the rising sophomores and “Math IC” had the top third.
  • Phillips Academy uses a 1–6 scale for summative grades rather than letter grades. The official labels are as follows:
    • 6—High Honors [at least ~93%]
    • 5—Honors [at least ~85%]
    • 4—Good [at least ~77%]
    • 3—Satisfactory [at least ~69%]
    • 2—Low pass [at least ~60%]
    • 1—Fail [at least ~40%]
    • I included a key with more specific interpretations of these labels in the progress reports.
  • The back-end of these progress reports comprises an Excel spreadsheet and a mail merge in Word, so it’s relatively easy to produce report cards on the fly once it’s all set up.

I wanted to reflect these ideas in putting together this system:

  • Each course was designed backwards from the learning targets, which were given to students up front so that they knew exactly what the expectations were.
  • No summative grade was attached to any particular assessment. Students received written feedback on their work as well as progress reports that reflected their current level of mastery on each learning target.
    • Scores were attached to skills rather than assignments.
  • Each learning target was scored on a 1–4 scale. (A key for these is also included.)
    • The code to the left of each learning target is a reference to a section in the textbook so that students could easily look up examples and additional information.
    • The summative grade for each unit was calculated by averaging the learning targets for that unit.
  • The final exam, which was cumulative, focused on those skills for which the class as a whole had the lowest scores, so as to provide the greatest opportunity for demonstrating improvement.
    • Students could bump all the way up from a “1” to a “4” for a particular learning target if they demonstrated mastery on the final.
    • If a student had significant trouble with a learning target on the final, they could bump down at most 1 level. If they already had a “2,” that score remained.
  • The summative grade for the course was calculated by averaging all of the learning targets from the course.
    • The method for converting from 1–4 to 1–6 is described below.
  • Throughout the summer, students had the chance to demonstrate that they now understood something they previously did not.
    • This could take the form of a short interview or answering a brand new question addressing a given learning target.
  • In order to earn the right to another attempt, students were required to engage in additional learning (making corrections, completing practice problems and checking answers, making flash cards or graphic organizers, etc).
    • In addition, students could not ask to demonstrate new learning on the same day they’d received tutoring from me. I would tell them, “I need you to sleep on it and try it tomorrow without my help so we can make sure it made it into long-term memory.”
    • Students were repeatedly told, “Over the course of the summer, you will have multiple opportunities to show what you have learned. The only truly final opportunity to show what you know will be the final exam.”
      • Consequently, students could always improve their scores on each learning target. Scores of “1” and “2” were treated as “not yet” rather than “failing.”
      • The stakes for any one assessment did not feel unmanageably high.
  • Homework completion was reported separately from mathematical achievement.
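
The scoring rules above can be sketched in a few lines of Python. This is only an illustration, not the Excel spreadsheet behind the actual progress reports. The gradebook layout, the target codes, the sample scores, and the function names are all hypothetical. The final-exam rule also rests on two assumptions: any final-exam result below the current score counts as "significant trouble," and a current score of "2" or lower is never reduced.

```python
from statistics import mean

# Hypothetical gradebook: unit name -> {learning-target code: score on the 1-4 scale}.
# The codes and scores are invented for illustration.
scores = {
    "Unit 1": {"1.1": 4, "1.2": 2, "1.3": 3},
    "Unit 2": {"2.1": 1, "2.2": 3},
}


def update_after_final(current, final):
    """Return the learning-target score after the final exam.

    `final` is the level shown on the final exam (1-4), or None if the
    target did not appear on the exam.
    """
    if final is None:
        return current          # target not assessed on the final
    if final >= current:
        return final            # can bump all the way up, e.g. 1 -> 4
    if current <= 2:
        return current          # assumption: a "2" (or a "1") is not lowered
    return current - 1          # otherwise drop by at most one level


def unit_grade(targets):
    """Average the learning-target scores for one unit."""
    return mean(targets.values())


def course_grade(gradebook):
    """Average every learning-target score in the course."""
    all_scores = [s for targets in gradebook.values() for s in targets.values()]
    return mean(all_scores)


print(unit_grade(scores["Unit 1"]))
print(course_grade(scores))
```

Because scores live on learning targets rather than on assignments, a successful reassessment simply overwrites the old score for that target, so an earlier unsuccessful attempt is never averaged in.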

After a period of adjustment, nearly all students came to internalize the growth mindset implicit in this method of grading and reporting, and reviews were very positive.

Naturally, there were plenty of areas for improvement as well:

  • I tried to capture too many learning targets, and they were often too granular.
    • For example, I’m not sure that “I can identify the intervals over which a function is increasing, decreasing, and constant” is significant enough to merit its own learning target. Perhaps this specific skill belongs under a broader learning target.
    • On the other hand, I found “Using a table, a graph, or an equation, I can explain what it means for a function to be quadratic” to be a useful piece of information to capture and report on.
  • By averaging all the learning targets, I sent the message that all learning targets were equally important.
    • In reality, I’ve written learning targets requiring different depths of knowledge. It might be better to explicitly group learning targets by DoK and to calibrate the distribution, and I imagine this distribution would vary based on the level of the course.
  • Broader learning goals, such as mathematical practices and habits of mind, were omitted.
    • Goals such as communication, mathematical reasoning/proof, modeling, and attention to detail/precision were not explicitly measured or reported.
    • A colleague of mine has done some excellent work in enumerating these types of goals, and I’d like to try to pick a few of them to focus on this fall.
    • This summer, I generally didn’t penalize students for careless mistakes if the core understanding seemed to be there. However, I don’t want to send the message that attention to detail isn’t important, so I’d like to find a way to capture some data about precision.
  • The conversion process to achieve summative grades was somewhat arbitrary.
    • Here was the scale I used; note that the bar is slightly higher for the upper-level class:
      • 6: At least 3.7
      • 5: At least 3.2 (For Math IC, 3.3)
      • 4: At least 2.7 (For Math IC, 2.8)
      • 3: At least 2.3 (For Math IC, 2.3)
      • 2: At least 1.7 (For Math IC, 1.8)
      • 1: At least 1
    • I’d like to explore how this might look for converting to letter grades.
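
For concreteness, the conversion from a 1–4 average to the 1–6 scale can be written as a simple lookup. The sketch below copies the cutoffs exactly as listed above and applies the unparenthesized values to Math IA. The names `CUTOFFS` and `to_six_point` are hypothetical, and this is not the spreadsheet formula actually used.

```python
# Cutoffs copied from the list above: (minimum 1-4 average, grade on the 1-6 scale),
# ordered from the highest grade to the lowest.
CUTOFFS = {
    "Math IA": [(3.7, 6), (3.2, 5), (2.7, 4), (2.3, 3), (1.7, 2), (1.0, 1)],
    "Math IC": [(3.7, 6), (3.3, 5), (2.8, 4), (2.3, 3), (1.8, 2), (1.0, 1)],
}


def to_six_point(average, course):
    """Convert an average of 1-4 learning-target scores to a 1-6 summative grade."""
    if not 1.0 <= average <= 4.0:
        raise ValueError("average must lie on the 1-4 scale")
    # Return the first (highest) grade whose minimum the average meets.
    for minimum, grade in CUTOFFS[course]:
        if average >= minimum:
            return grade


print(to_six_point(3.25, "Math IA"))  # meets the 3.2 cutoff for a 5
print(to_six_point(3.25, "Math IC"))  # misses 3.3, meets 2.8 for a 4
```

Writing the cutoffs as a table makes the arbitrariness easy to see: the same 3.25 average earns a different summative grade depending on the course. A letter-grade version would only need a different table.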

What I’d especially like feedback on:

  • How many learning targets seem reasonable for a math class with ten units?
  • What range of cognitive demand (depth of knowledge) should be required by a learning target?
    • How should the answer to this question change based on the level of the class?
    • Should learning targets be framed in terms of the Mathematical Tasks Framework, the Transfer Demand Rubric (Proposed Grading Framework), or some combination of the two along with Webb’s DoK taxonomy?
  • What types of cutoffs might make sense for converting from a 1–4 scale to a letter grade scale?
    • For example, should the gap between a B– and a B be congruent to the gap between an A and an A+?
  • What is the most effective way to measure and report attention to detail, precision, and avoidance of careless mistakes?
  • Anything else that comes to mind.

Thanks for taking the time to read. Again, no pressure to reply—just wanted to get these thoughts out while they’re fresh.

OK, back to summer!