How do you measure, measure a year?
Measurement is a word that gets used a lot in evaluation circles.
It's also a word that gives me the icks. It conjures a mental image of someone with a stern expression, carrying a yardstick, who either says you're up to snuff or raps your knuckles.
That is not who I wanted to be when I grew up.
But measurement is central to our work. If you want to know how much, how far, how long, or how strongly something is happening because of your work... well, you need some variety of measurement.
The good news: measurement doesn't need to be so intimidating or arbitrary-feeling. It just takes a little extra time in the planning stage.
One of the biggest things that separates an arbitrary yardstick from a targeted one -- and determines whether measurement actually works for you -- is grain size.
Let's break it down.
What do we mean by grain size?
You know that expression about knowing the difference between the forest and the trees? That's grain size: the degree of specificity in what you're trying to measure.
A lot of times, educators are dealing with outcome language that is grand and sweeping. It's that language from grant proposals. It's big. It's impressive. It brings tears to your Program Officer's eyes. It gets the big fat check in the door.
And that is usually a really big grain size.
It's a great place to start.
It's a terrible thing to try and measure.
Once you (and your evaluator) are ready to think about measurement, you pretty quickly realize that sweeping statement is way too big. Trying to measure at that grain size is a recipe for disappointment.
You need a smaller – and more specific – grain size.
The tricky part here is getting beyond hand-wavy outcome platitudes. Now you have to get serious about what you really do well within that big outcome's zone.
Let’s imagine a hierarchy of related outcomes you could articulate for an environmental education program. It could influence:
- Environmental sensitivity
- Caring for nature
- Caring about this habitat
- Perceptions of this place we just explored
See how the grain size shrinks from one to the next?
The more specific you can get about what change looks like in your learners, the better off you will be in the long run.
How do you figure out the grain size of a measurement tool?
When measurement time comes, Google Scholar is a place many of us turn. There are plenty of pre-existing tools out there, and many are well researched. My hat is off to the scholars who develop these tools. We thank you.
But a lot of tools you find in literature are operating at a giant grain size. They measure Big Psychological Concept (TM).
Michelle recently located a tool for measuring awe that we were considering for a project. The developers had to break awe into six pieces, because awe is complicated. And even the pieces are big picture. (Think: sense of vastness, self-loss, etc.)
Just because it's a well-researched tool doesn't mean it's the right tool for you.
Especially if you’re in an environment with pressure to prove outcomes, this is not the time to shrug and say, “any survey will do.” You have to measure the measurement tool.
This is my smell test: Imagine yourself in the shoes of your learners. Read the survey questions carefully. Does it speak to their experience in your program? Does it use their language? Does it feel like things they would say after working with you? And that they would not have already been saying before they met you?
If it doesn't pass the smell test, it's often too big to align with what you do.
But aren't those valid measures??
Yes, they are. But if it failed the smell test, be prepared for results to feel really… meh. And you'll have two options to deal with "meh" results.
Interpretation Option 1: You failed. You need to overhaul the program to pass the test you selected.
I’m not gonna lie, that option makes me gag a little. I don't like informal educators teaching to a damn test.
Interpretation Option 2: You measured an outcome over here, when the real outcome is over there. Or, more likely, your outcomes are happening at a smaller grain size than whatever you tried to measure.
So, how do you dig deeper?
I advise starting with the ears closest to the ground -- the educators who work with learners. Ask for examples of what they hear or see that indicate good learning is happening.
Those anecdotes are likely the tiniest grain size.
Now, you can look for patterns to pull it up a notch or two in size. What commonalities do you find? What is a slightly more general way of saying it? Is there a way to translate that into a tool to capture those takeaways? Or find a tool that speaks to that?
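If it helps to see that pattern-hunting step concretely, here is a minimal sketch, in Python, of one way you might tally themes across educator anecdotes. The anecdotes, the theme keywords, and the keyword-matching rule are all hypothetical placeholders, not a real coding scheme -- real qualitative analysis takes more care than this.

```python
from collections import Counter

# Hypothetical anecdotes gathered from educators (placeholders, not real data)
anecdotes = [
    "A student said the creek smelled alive and wanted to come back",
    "Kids kept asking what the herons eat here",
    "One group argued about whether the trail litter hurts the frogs",
    "A student said she never knew a wetland could be this loud",
]

# Hypothetical theme keywords an evaluator might settle on after reading the anecdotes
themes = {
    "wants to return": ["come back", "return"],
    "curiosity about this habitat": ["asking", "never knew"],
    "concern for local wildlife": ["hurts", "litter", "eat"],
}

# Tally which themes show up across anecdotes -- a crude stand-in for real qualitative coding
counts = Counter()
for note in anecdotes:
    lowered = note.lower()
    for theme, keywords in themes.items():
        if any(keyword in lowered for keyword in keywords):
            counts[theme] += 1

for theme, n in counts.most_common():
    print(f"{theme}: mentioned in {n} anecdote(s)")
```

The point isn't the code; it's the move from individual anecdotes (tiniest grain) to a handful of shared themes (one notch up) that you could then turn into survey items.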
Admittedly, this is not always easy. But the closer you get to the reality of what change looks like in your setting, the more satisfying your data is going to be. And the better positioned you will be to tell your impact story.
What if we just don’t know?!?
That tells me you shouldn’t jump to measure anything yet. Take a beat and explore. Spend time prying open the black box, until you get a sense of what you need to measure.
A little exploration is a worthwhile investment if you are otherwise grasping at straws when it comes to digging into your outcomes.
Real World Example:
We were evaluating an environmental education field trip. The team was interested in whether they had boosted "environmental sensitivity."
I will be honest, I was not 100% clear what that meant in the context of a half-day field trip and 12-year-olds. I asked a lot of questions about what they did, what they heard, what they aimed for -- really -- with their kids.
In the final approach, we tried measuring things two different ways:
- An environmental attitudes scale from the literature that aligned with big-picture Care for Nature (general);
- A question we piloted that focused on associations with This Environment (specific).
We measured pre-program and post-program.
Environmental attitudes? *sad trombone* It wasn't that kids didn't care about nature. They were fairly pro-Nature to begin with. And one field trip didn't turn them into bigger tree-huggers than they already were.
Perceptions of this environment? *happy trumpet* We saw shifts in certain perceptions of this environment -- ones that aligned with the educators' aims. Positive perceptions went up. Negative went down.
We shrank the grain size in two ways: we looked at perceptions rather than attitudes, and we looked at this particular environment rather than the environment as a global concept.
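For the data-curious, here is a minimal sketch in Python of what a pre/post comparison like this can look like. Every number and item name is invented for illustration; this is not the actual field trip data, just the shape of the analysis.

```python
from statistics import mean

# Hypothetical pre/post scores on a 1-5 scale (invented for illustration, not real data)
broad_attitudes_pre  = [4.2, 4.0, 4.5, 3.9, 4.3]   # broad "care for nature" scale: already high...
broad_attitudes_post = [4.3, 4.1, 4.4, 4.0, 4.3]   # ...and barely moves after one field trip

this_place_pre  = [2.1, 2.5, 1.9, 2.4, 2.2]        # positive perceptions of THIS environment...
this_place_post = [3.6, 3.9, 3.2, 3.8, 3.5]        # ...which is where the shift actually shows up

def shift(pre, post):
    """Average post-program score minus average pre-program score."""
    return mean(post) - mean(pre)

print(f"Broad attitudes shift:       {shift(broad_attitudes_pre, broad_attitudes_post):+.2f}")
print(f"This-place perception shift: {shift(this_place_pre, this_place_post):+.2f}")
```

Same kids, same field trip, two very different stories -- all because of the grain size of what you chose to measure.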
Anyone else ever struggled to find the perfect grain size to measure your story? Reply and share!