A Standards Based Grading Deep Dive – Part 2: How We Assess Our Students

The Back Story

It’s 2019, a Friday afternoon in October, and I’m driving home from school. It’s been a tough day and my brain is absolutely cooked from making 8,000 decisions during my Math 8 classes and giving a cumulative exam in Enhanced Math 1. It seems like giving a test should make for an easy day, since you don’t have to do much, but that’s not the case. Stress levels in students are high. With stress and high expectations come the willingness to compromise morals and the desire to cheat. My attention must be laser-focused to make sure students are working with integrity. It’s…not fun. I know there are other ways to assess students, but the most authentic assessment of each student’s ability is to assess them independently (as far as I have found, anyway).

Then there’s the grading. Before switching to Standards Based Grading, we used a traditional points-based system, assigning point values to each question, then deducting points from a question if work or formatting was incorrect. With about 100 students taking a test that is about 20 questions long, that’s examining 2,000 test items, most of which have multiple steps of work. I might give 1 or 2 well-crafted multiple choice questions, but 90% of the exam is handwritten work with many steps to inspect. When I stack up all the exams, shove them in my messenger bag, and toss the bag in the car, the ride home feels so daunting, knowing I must spend the next 8-10 hours grinding.

Having already gone over the way I am grading student work now using the 4-point rubric, let me just say that it is so much better than itemizing point deductions for each question like I used to. I would drive myself crazy trying to determine if something was minus 1, minus 2, or more. I even got to the point where I was deducting one tenth of a point on certain questions, which in hindsight was absolutely insane. Like, what was I doing???

Target Specific Assessments

One of the best changes we made this past school year was how we assess our students. Before 2020 we would give one large assessment each month, which was always cumulative up to that point. That meant that at the end of February the students would get the “February Test”, which could cover any topic they had learned from August until about mid-February. We emphasized more recent material, and the old stuff was relegated to a few questions on the essential Learning Targets. The test was always worth 100 points, and we used a year-long gradebook. By the end of the school year the gradebook had about 1,200 points in it (including the monthly tests, quizzes, and homework).

The rationale was that we wanted students to maintain their skills throughout the year, instead of simply learning something for a short time and then never recalling it again because it would never be assessed again. While I agreed with this premise, the downside was that these tests were very stressful for the students, usually took up an entire block period to administer, and took an extremely long time for me to grade. Each exam would have around 20 questions on it of varying Depths of Knowledge, so grading around 170 of them each month was mentally exhausting.

So instead we switched to more frequent Target specific assessments, focusing on only 1 to 2 Targets each. The assessments were much shorter, able to be completed in a 51-minute period by most students, and each Target could be covered by a variety of questions at different levels of rigor. We included spicy peppers to indicate to students which questions we considered more challenging, and those were the ones they should get correct to be considered as having “Thorough” understanding of the Target. Here is an example of an assessment I gave last year in Math 8:


The Benefits of Target Focused Assessments

In 8th grade we gave 14 different assessments that covered 20 of the Learning Targets for the year. This meant that I graded assessments more frequently, but each assessment was much quicker to grade. Whenever I assessed Math 8 I was able to grade both class periods in under one hour, usually on the same day I gave the assessment. I could literally never do that before. Many times students would take the assessment on Friday and I could hand it back to them on Monday. Back in the days of grading a cumulative test, it might take me a week or more to finish marking everything, so the feedback arrived later and was less valuable.

One of the best results from the more Target focused, smaller assessments was that students were not as stressed out or overwhelmed. Since they were shorter and more focused, students were able to finish them in a reasonable time period, and students with IEPs and 504 plans did not need to use their time accommodations as often. Additionally, with the retake policy we adopted, students knew that they always had a second chance to take a different but similar version of the exam, so if they just weren’t feeling it the day of the test, they always had the chance to try again.

Giving these shorter assessments also gave me more flexibility on the day of the test. Since most students would finish with additional time, I was able to give them some more interesting tasks to do once they were finished. I now post Open Middle problems at my thinking stations, Non-Curricular Thinking Tasks, extension problems from previous Targets, or Desmos activities that preview the next Target we are going to learn. Assessment day is now a “show me what you know, then go find something you are interested in” kind of day, rather than a stress-fest of feverishly working until the bell rings.

This isn’t to say that every student was instantly successful the first time, or that my assessment results were amazing across the board. In Part 3 I will look at how students did overall, how they reflected on their own results, and whether the retake system worked for all students. See you next time!

A Standards Based Grading Deep Dive – Part 1: The Grading Rubric

If you ask 100 classroom teachers what the least favorite part of their job is, I am willing to bet that at least 80 of them will say “grading student work”. Well, that might not be accurate. Almost all of them will say “mandated professional development”, with grading being a close second. Having taught middle school math for 2 decades I can safely estimate that I have assessed at least a million math problems that my students have completed on some kind of assessment. Don’t get me wrong, I get a tiny spark of joy each time a student gets a question correct (Yay, they learned the thing!). It’s just very time consuming, and I know that every time I grade something, there will always be a small number of students who are going to have some seriously negative emotions when I hand it back, whether they do horribly or just get one question wrong. Too many emotions tied up in points, grades, and self-worth.

So two years ago the Math Department at my school switched to Standards Based Grading, with the hopes of giving students better feedback on their learning, an improved sense of hope and efficacy, and a focus on the learning rather than the grade. (I wrote about this back in October if you would like to read that first). We developed a whole new grading system based on a multi-point rubric for each Learning Target, offered multiple chances for students to be reassessed, and removed mandatory homework for points in the gradebook. It was a lot of work, but work worth doing. Or was it?

So instead of just going on feelings, I wanted to reflect on how last year went, and look at the data available to me to see if the changes are working as intended. It’s quite the journey, so I plan on looking at this in multiple posts, otherwise this blog will be gigantic. Let’s dive in to Part 1!

Part 1 – The Grading Rubric

Two years ago we started off with a very basic 5-point scoring rubric for each Target to ease the transition from a traditional gradebook to an SBG one. Here’s what that looked like:


This gave a simple 20% breakdown for each letter grade, so an A was 80%–100% and meant that more often than not a student had “Mastered” the Targets in the class. Numbers-wise this was easy for parents and students to understand. In application, things got really weird when we tried to grade an assessment. Any teacher who has assessed students for a while knows what “Mastered” and “Beginning” look like. It was the middle area where there was a lot of subjectivity. I personally had many instances where I could not tell the difference between “Proficient” and “Approaching”, as did all of my colleagues. About halfway through the year we realized that this needed to change, since we kept having long discussions about what was Mastered versus Proficient, and Proficient versus Approaching. While grade norming is essential in a PLC, you can’t spend all of your planning time doing only that.
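To make the 20% breakdown concrete, the whole conversion is just a band lookup. Here is a minimal sketch in Python; only the A band (80%–100%) is stated explicitly above, so the even 20% bands for the other letters are an assumption on my part:

```python
def letter_grade(percent):
    """Map a year-long gradebook percentage to a letter grade using
    even 20% bands. Only the A cutoff is confirmed in the post; the
    B through F cutoffs below assume the same even spacing."""
    if percent >= 80:
        return "A"  # more often than not, Targets were "Mastered"
    if percent >= 60:
        return "B"
    if percent >= 40:
        return "C"
    if percent >= 20:
        return "D"
    return "F"
```

The simplicity is exactly why it was easy for parents and students to understand, even though, as described above, the grading behind the numbers was harder to pin down.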

So last year we transitioned to a 4-point rubric, which is most often advocated for when you look into SBG practices. We developed more language to help ourselves and our students know the difference between each level of understanding, and we updated the category language, since “Mastered” felt like a weird and highly subjective descriptor. So here’s what we used last year:

I really liked this rubric more than the previous one. Since there were fewer levels to consider, it was easier to see from the student work where a student was. The only place I ran into trouble was telling the difference between “Thorough (4)” and “Adequate (3)”. Sometimes it was just really hard to tell. More often than not I would assign a student a 3, then meet with them to go over their work and talk about what needed to improve to reach a 4. Since they could retake any assessment, this always felt good. It’s not like they were stuck with that score.

Let’s look at one of the assessments I gave last year, and how I graded it for a few students. Here is the very first assessment I gave in Math 8 for Target 1.1:



One other practice I personally developed to help me determine proficiency levels was to use a spreadsheet I created for each Target assessment. As I examined each question I would grade the response using the same 4-point rubric and enter the score. I had the spreadsheet average out the scores for the entire assessment, then use the number as a general guide as to what level the student was at. Here’s a link to a sample I have for one of my Target assessments for 8th grade.
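In spreadsheet terms, the calculation above is just a per-question average. A minimal Python equivalent looks like this (the example scores are made up for illustration; they are not from my actual spreadsheet):

```python
def target_average(question_scores):
    """Average the per-question 4-point rubric scores for one Target
    assessment, rounded to one decimal place, to use as a rough guide
    to the student's overall proficiency level."""
    return round(sum(question_scores) / len(question_scores), 1)

# A hypothetical student who scored 4 on eight questions and 2 on one:
target_average([4, 4, 4, 4, 4, 4, 4, 4, 2])  # → 3.8
```

The key word is “guide”: the number starts the conversation about a proficiency level rather than deciding it, for the reasons discussed next.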


One of the tricky things about this method of grading, though, is that not every question is the same level of rigor, so the average score doesn’t really tell you the proficiency level. For instance, question #8 required the students to create their own equation using an “Open Middle” structure, then prove that what they created met all of the criteria needed. This is way different than question #1, which was a basic two-step equation with only whole numbers. This is where the holistic approach comes into play.

For example, let’s look at Student #6 and Student #7. Both students got an average of 3.7 on the assessment, but one of them scored an Adequate (3) and the other a Thorough (4). Why is that? Since Student #6 got questions #5 and #6 wrong, and those were considered less rigorous (they were basic equation solves), I found them to be at the Adequate level for the entire Target, but not Thorough. Student #7 got two questions wrong as well, but there were some factors to consider. For question #7, they made a simple calculation mistake in the final step of the problem. Not a big deal. I don’t really downgrade students’ proficiency level because of a simple calculation mistake. For question #9 they were able to circle the part of the work that had the error in it, but this student was a first year English learner so they did not have the vocabulary needed to do the written explanation correctly. I could tell that they understood the overall concept. That’s an English problem, not a math concept problem. They have Thorough understanding of solving equations, so lowering their score because they have only spoken English for 6 months is not appropriate.
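If I were to encode the holistic reasoning above as a rule, it might look something like the sketch below. To be clear, this is purely an illustration of my thought process, not a tool I actually use, and the `rigorous` and `error_kind` fields are hypothetical labels I invented for the example:

```python
def holistic_level(questions):
    """A sketch of the holistic judgment described above (not an actual
    department policy). Each question is a dict with:
      score      - the 0-4 rubric score for that question
      rigorous   - True if it was a high-rigor ("spicy pepper") question
      error_kind - "calculation" or "language" when the miss was
                   non-conceptual, otherwise absent
    Simple calculation slips and language-only issues don't lower the
    level; misses on basic questions cap the Target at Adequate (3)."""
    capped = False
    adjusted = []
    for q in questions:
        score = q["score"]
        if score < 4 and q.get("error_kind") in ("calculation", "language"):
            score = 4  # non-conceptual error: don't downgrade understanding
        if score < 4 and not q["rigorous"]:
            capped = True  # a basic-skill miss signals a real misconception
        adjusted.append(score)
    avg = sum(adjusted) / len(adjusted)
    level = 4 if avg >= 3.5 else 3 if avg >= 2.5 else 2 if avg >= 1.5 else 1
    return min(level, 3) if capped else level
```

Under this sketch, a Student #6 type (high average, but misses on basic solves) lands at a 3, while a Student #7 type (misses only from a calculation slip and a vocabulary gap) lands at a 4, matching the judgments I describe above.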

This is why I enjoy Standards Based Grading, but also why it can take so much time to do. When all you do is give points for correct answers and turn the points into a total score of x/100, you lose the big picture. Even though Student #6 got a high average score, they have a few misconceptions in their equation solving that I still needed them to work on. If I give them a Thorough on the Target, they are less likely to work on the misconception. This way, with some coaching and a bit of intervention they are able to re-assess later and earn a 4 on the Target, should they have the desire to.

In Part 2 I will examine the types of assessments we gave in class, and how changing to shorter, more focused assessments has benefitted both me and my students.