What purpose do schools serve in society? Obviously, they are a site for learning. For some, they are a site to train the next generation in the skills needed for a strong economy. For others, they are a site to train more thoughtful and active citizens. With the recent mass closure of schools (or mass movement to schooling from home), we’ve had a chance to reflect on what else happens at a school.
We argue that schools are extremely complex sites that do the following:
1. Schools keep kids safe so that adults can go about their business running the economy
2. Schools provide environments for emotional, civic, and social development
3. Schools are hubs for social welfare programs, often for those with the greatest need
4. Schools bring communities together
5. Schools address the unique and specific needs of all learners
However, schools are not held accountable for all of these functions. Instead, schools are primarily held accountable for the learning evidenced by end-of-year tests, graduation rates, or other easily countable statistics.
Punya and I argue that during this time of global crisis, education leaders can a) reevaluate what schools really do and b) rethink school accountability.
All this will help frame school as a place, above all, to inspire.
A tweet the other day got me thinking. Can tests somehow show care?
Testing has a very bad reputation. One of my most common jokes when explaining what I study goes something like this:
Person: What do you study?
Me: I study standardized testing … So, yes, it’s my fault.
It seems that testing gets blamed for a lot of the ills in education. And, to be fair, testing (and the consequences of testing) absolutely deserves some blame.
Testing, Grading, and Motivation
For one, some of the best learners that I know stay motivated by being curious. Test scores (and grades, in general) often provide fleeting motivation. Students may focus on grades or test scores but, in turn, forgo pursuing something that they feel passionate about. In other words, they turn to external motivations (e.g., prizes, money, grades, scores) instead of internal motivations (e.g., curiosity or interest).
In the short term, external motivation can be extremely effective (especially for overcoming something that you would absolutely not be interested in otherwise). However, in the long term, it leads to burnout and to students who ‘check out’.
As educators, we want motivations that are sustainable and geared toward the long term goals/interests of the student. Tests can get in the way of these sustainable motivations.
Standardized tests can also dehumanize students. They are stressful. They are impersonal. They compare things that may not be comparable. At best they are a little unfair (bias is inevitable); at worst they are really unfair (e.g., a math test written in English given to a math whiz who doesn’t speak English well).
Tests, which were originally developed for speedy military job training placement, treat students as resources. In a war, when time and efficiency are of the essence and sacrifices must be made, that could make some sense. However, this should not be the standard with which we as a society motivate our students or evaluate our teachers.
Tests Are Not All Bad
However, tests themselves are not entirely to blame. Tests are really useful. Parents, teachers, and students can learn a lot from looking at test scores. Tests really do have some predictive power regarding academic strengths and weaknesses (as do grades).
Also, test developers really do care (I know, I’ve worked with many). It takes a lot of money to develop a single question for a large standardized test. The test questions themselves go through a rigorous process to try to root out as much bias as possible. Test items are never perfect, but test developers do try hard to be as fair as possible.
Tests that Care?
So, what are some first steps to humanizing testing? I think we need to start by emphasizing standardized tests less. They are meant to be one of many types of measurements of learning, yet policymakers and the media (and parents and students, etc.) seem to harp on test scores. Test scores are treated as the most “objective” measurement of learning/achievement. But really, they shouldn’t be treated differently than any other type of measuring stick (like grades, or recommendation letters, or a portfolio of student work, etc.).
Test developers often decry the way that test scores are used (though they also benefit from it). For example, one of the largest organizations of measurement scholars (the National Council on Measurement in Education) published a statement denouncing the use of test scores for evaluating teachers. But most states still do it (see my open source article for more).
Also, every year the media reports on how far US students are behind students in other countries (based on PISA scores), but those numbers are extremely hard to compare fairly.
So, is there a way to teach policymakers, media members, and others to weigh test scores less heavily? I think we would all feel better if we knew they didn’t matter so much. Think of it like your score in a video game. It matters, you pay attention to it, but mostly you focus on other goals, like having fun and completing the game.
Also, is there a better way to make tests? Can we make tests more localized? Can we take teacher opinion into account?
Standardized tests feel so foreign and so … standardized. They are created by large companies in New York (e.g., College Board) or large non-profits in Iowa City (ACT, Inc., I see you) by a small group of hyper-educated experts called psychometricians. These psychometricians are all-stars at understanding how to make tests more valid and more reliable based on a certain set of standards that should be measured.
Personally, I think it would be helpful to connect these psychometricians with teachers or parents. Good teachers and parents have a lot of questions about their students and children. Did they understand that deeply or not? Why are they struggling with this or that? Are they interested in this or that? But good teachers are already overworked creating content, and parents may not have the expertise to create ways to measure their child’s learning. A professional test developer could help assess students. These assessments would not be high-stakes. Instead, they would be helpful and personalized (or at least localized). They might even be humanizing.
In fact, good tests can often teach students. The tests, themselves, help the students learn.
I think back to when I was a young child learning TaeKwonDo (a martial art common in Korea). One of our tests to advance to the next color belt was to break a wooden board by hand. As a student, this simple test taught me a lot. I remember that the first time I tried to break the board, I failed, despite having mastered the ‘form’ of the move. For example, in TaeKwonDo, when you punch with your right hand, you actually start with your left hand in front of you. You generate power by drawing the left hand into your body as you launch your right hand forward. Stance is also important. Your legs should be slightly bent, wider than shoulder width apart, and spread out diagonally for maximum stability. I knew all of this. My form was excellent. But the first time I struck a board, it didn’t break.
I tried to add raw power the next time (forgoing my learned form), but that didn’t work either. I missed the target and still did not break the board. I finally broke the board when I realized why the form existed. It was not to look good, or to be memorized. Previously, I had always felt like having proper form was like having proper show marching technique in the military (i.e., it was meant to demonstrate and train discipline). But when I realized that the form was meant to create the hardest, most precise hit, that is when I broke the board. The test (i.e., breaking the board) actually taught me. I finished the test with more knowledge and with more excitement about what comes next.
Could tests be designed to capture that feeling more often? Let’s end by thinking about features of an ideal student-centered test (keep in mind that the ideal test is impossible):
Our whole world is designed. But so many of the designs are invisible.
Many of us are discovering this as we start to settle into new work-from-home habits. We are learning that most of our daily habits, the procedures which we use to be more productive, are gone. We are also seeing a lot of the design in our daily worlds that used to be invisible.
Across the nation, those lucky enough to continue their work from home are starting to redesign their previously invisible processes. Show of hands, who has changed their workspace since starting to work from home? Perhaps you found a room with a door so you can close it? Perhaps you shifted to a spot with more natural light? Who has struggled to end the workday? Suddenly, there is no logical separation between work and home. There is no commute, no walk out of the office doors, to mark an end to those hours. These little redesigns have made for amusing stories. For example, talk show host Stephen Colbert still dresses in a suit to get into the work mindset despite sitting in a room at home.
Also, Mynoise.net, which hosts high quality recordings of many background noises including “Calm office” sounds, has become a surprise hit during the pandemic. Feel free to adjust your ‘office’ to have more or less chatty colleagues or air conditioner noise. Or better yet, visit this Google Doc or this subreddit to see the oddly specific ‘sound recipes’ that fans of the website have come up with, from “Inside a mountaintop cabin on a stormy night” to “Japanese bullet train.”
The need for these daily redesigns, including rethinking what you listen to and what you wear while working from home, defies simple logic. Thinking logically, working from home should improve personal flexibility, allow more comfort, and provide extra time without a commute. As Science Magazine writer Adam Rubin states:
I had visions of waking up without an alarm clock, throwing on flannel pants, making French press coffee, and plowing through piles of work—distraction free—next to a window at my dining room table. I’d cook lunch on my own stove instead of microwaving frozen tamales, call in to meetings from a chaise lounge in the backyard instead of languishing in a conference room, and knock off around 4 p.m., basking in the satisfaction of a productive day.1
He believed, as many of us did three weeks ago, that working from home would lead to a happier, healthier, and potentially more productive lifestyle.
But we know now that it is more problematic than that. More complex. People who work from home often realize they must learn to manage not only their physical workspace (e.g., making sure the desk is comfortable and the lighting is good enough) but also their mental workspace (e.g., learning to separate work life from home life). People who work well from home learn to iterate on their processes in order to discover what feels good, what helps with productivity, and what should work but nonetheless fails (for me, working from bed).
Fortunately, for me, COVID-19 did not change my working style much. As a PhD student, I already worked from home often. I learned to wake up, make coffee, and write first thing. I need long blocks of quiet time to write well, so I write best very early and very late (when not a creature is stirring). Then, when my morning momentum fades, I take a shower, shave, and get dressed to go outside–except I don’t go outside (this was true before COVID-19 as well). But, by getting dressed as if to go outside, I am more likely to take a walk, go buy groceries, or go check the mail later in the day when I am most tired. At the end of the day, at 5pm, I completely shut it off. I have had to learn this because PhD work never ends and guilt follows you everywhere. However, because it is part of my process to end the day at 5pm, I have gotten much better at avoiding irrational guilt.
The last section of my workday is for fun work. At about 9:30pm, I set aside a few hours to work (or not). I only work on stuff I want to work on. If I don’t want to work on anything, I don’t. This is typically a time to write a blog post or tackle a particularly weird coding problem. Usually, this is stuff that is down my priority list. This is “I’ll try to get to it, but no promises” stuff. However, I had to learn to have this time structured (or designed) because, for PhD students, it seems that you can never get to the stuff down the priority list unless you set aside some time for it. And by setting aside some time for fun work, I feel more invigorated and more inspired to slog through some of the less fun work (like, for me, technical writing and formatting).
Lastly, my designed processes do not always work. If I am working on a big, particularly boring task, then I do what this haiku by David Dayson mentions:
I work from home, but not at home. I leave. I go pay $5 for a ‘floofy’ coffee or a delicious doughnut and find a place to sit for hours away from home. When work is particularly boring, I need the buzzing world around me to brighten my day. I need to overhear conversations; I need the faded sound of songs I’ve never heard before. Basically, I need to design some joy into my day.
So next time you think about your daily habits, remember, these habits were designed over time. They are artificial. They are imposed. The way you design your day illustrates an important part of design: process design. It illustrates how small tweaks to habits or schedules can have a disproportionately large effect on productivity or happiness. It also explains why so many feel off-kilter as they adjust to working from home.
The culmination of a complex three-year project is finally here. After three years of interviewing, analyzing, and writing, my co-authors, Audrey Amrein-Beardsley and Clarin Collins, and I just published a journal article called “Putting teacher evaluation systems on the map: an overview of states’ teacher evaluation systems post–every student succeeds act” that lays out an overview of teacher evaluation systems after the Every Student Succeeds Act passed.
For this article, we interviewed state education personnel from every state, then combined that information into accessible and impactful tables and maps. Our goal was to present information about the current state of teacher evaluation systems, but also to gather opinions from local policy makers about the strengths and weaknesses of their states’ systems.
The world is temporarily closed, so I’ve been working on some projects that were sitting on the back burner. You are currently reading one such project: Redesigning my personal website.
Since working on kevinclose.weebly.com back in 2016 (click for a fun blast from the past), my scholarship style has changed. I no longer work through ideas isolated at home with my laptop; rather, I do much more writing while ideas are in progress. I also try to engage others in the process. Doing so seems to produce better work in the long run. In fact, most of my ideas start as something completely different from how they end (see, for example, this post on talkingaboutdesign.com).
So, while this new website is primarily a blog updating my work as an education scholar and writer, it may also serve as a place to:
Repost works that inspire, or
Comment on news about education
My exact areas of interest can be hard to capture because, like waves, they constantly shift, dip, swell, and sometimes even crash together. However, in general, my focus can be boiled down to three wide categories: education, design, and measurement. These categories, when mixed together, often branch out into such diverse areas as computational thinking, constructive argument in the classroom, standardized testing pros/cons, educational data mining (or learning analytics), language in testing, and teacher evaluation.
So, please check back for the latest updates and musings on my work.
When not writing about design, Kevin devotes his time to redesigning and reimagining the measurement of student achievement. He takes a practical approach to designing ‘smart’ standardized tests that adapt for diverse students, but he also thinks about how and when tests fail to "measure up".