Hi Reader,

This month's newsletter might be the wonkiest newsletter I've ever sent you. It's tackling one of those nitty-gritty questions that comes up all the time but never seems to get a straight answer.

When writing survey questions, how many points do you put on the scale?

Lest you think we don't practice what we preach around here, writing this issue caused conversations about questions for multiple studies going into the field this summer. Everything I suggest below? Exactly what we had to consider.

For a look behind the curtain, let's dive in.

Also included:

Attention Nerds: Related Reading
On the Road Again: Michelle at the VSA Conference

Cheers,

Jessica

Scales that make you go hmmmm....

In a consulting session, a client putting together an event asked a very common question. A question so common that, if I had a nickel for every time I’ve heard it, I’d… well, I'd have a lot of nickels.

For a rating scale, how many points should you use?

A rating scale means those questions that ask you to respond on a continuum. It’s often: “How much do you agree or disagree?” But you can rate many, many sentiments on a scale.

And if you start googling this question, everything will probably tell you some version of "It depends!"

So, let's give you a roadmap to create a scale with the right number of points for you – without getting a PhD in measurement.

Before We Begin: There is no right answer.

Seriously, if you meet anyone who tells you there is a “right” way to scale a question… RUN!

There will be better options and worse options in any situation. But there is not a right and a wrong. And there certainly isn’t One Scale to Rule Them All.

In reality, there is a progression of decisions you make that ultimately affect advice on how many points to have. The key is to make those choices explicitly first.

Decision 1: Two Directions or One

The most common scale you see in the wild is an Agreement Scale. You may also hear it called a “Likert Scale” or (amongst the especially precise) “Likert-type.” Think:

Strongly Disagree | Disagree | Neither | Agree | Strongly Agree

Whatever you call it, its most important feature is being a bi-directional scale. It sets up two distinct and opposite poles. All points radiate outward:

One side is how much you disagree
The other side is how much you agree

Here’s another example:

Dissatisfied ⬅️⬅️⬅️➡️➡️➡️ Satisfied

With a bi-directional scale, you are saying: “I expect people could have completely opposite reactions to this thing. I care about which side they are on and how strongly they are on that side.”

The alternative is a single-direction scale. In that case, your scale starts at zero. And then it radiates outward, increasing in strength until it reaches peak SUPER-DUPER STRONG FEELING.

Think:

Not at All Satisfied ➡️➡️➡️➡️➡️ Extremely Satisfied

Which should you use? A couple of questions to ask yourself:

Does the thing I’m measuring legitimately have opposite characteristics? (Consider frequency. You can’t have negative frequency without a time machine.)

Would people have gradations of feeling on each side? (Could a normal person discern if they slightly versus strongly disagree? And does that matter?)

How likely are you to get responses on both sides? (If virtually everyone is positively inclined, your negative side will be nothing but tumbleweeds.)

This probably feels like a big first decision! But thinking about what you’re trying to measure and the direction it is experienced will make the next steps much easier.

Decision 2: Odd or Even

Let’s swing back to that “Likert Scale” you’ve probably heard about. It is very rigidly one thing:

Strongly Disagree | Disagree | Neither Agree nor Disagree | Agree | Strongly Agree

5 points
Perfect symmetry
Midpoint = “neither”

If you’ve chosen a bi-directional scale, an odd number of points leaves space for a neutral point in the middle. A place for someone to say, “Neither.”

That can be helpful.

It can also be a hiding place that does not mean what you think it means.

Many years back, I was chatting with a colleague who did some intensive scale development work. In testing early versions, she dug into how respondents used the neutral point. It was… not great. I'll paraphrase what I recall her saying:

“They were kinda squirrelly in using that neutral point. It did not mean neither positive nor negative. It meant... other things to them. So we went to six points and made them choose a side.”

An even number of points on the scale forces people to get off the fence. Do you agree or disagree? Pick. Were you satisfied or were you dissatisfied? Pick.

When I’m deciding odd or even, I think about the statement and the scale and ask:

Is there legitimately a perception of “neither” or “both equally”? Go odd.

Am I worried about people using neutral to be Minnesota Nice? Go even.

Decision 3: Words or Numbers

With a scale you also have to decide what to write above those checkboxes.

Anchored scales mean that every point is described in words.

Strongly Disagree | Disagree | Neither | Agree | Strongly Agree

Their rating is relative to their interpretation of the word. We assume that we share roughly the same understanding of those words.

Unanchored scales use numbers instead. Think: "On a scale from 1 to 10…"

Strongly Disagree = 1 2 3 4 5 = Strongly Agree

In these, your respondent places themselves based on their position relative to the end-points. 3 is still the midpoint, but you don't name it neutral.

Why does this matter?

If you want the solidity of words labeling each point, it’s extremely difficult to meaningfully label a lot of points. For anchored scales, you need to stick to a much smaller number of points.

Decision 4: Go Big or Go Small

OK, the number reflects the degree of nuance you expect people to be thinking about. Let’s work our way up:

3 or 4 points: Something almost factual, only slightly more nuance than yes/no. Think:

Did you feel lost while visiting our museum today?

Not at all | A little | Definitely

Speaking for myself here, but I have felt a little lost. And I have felt very lost. But between those extremes, I can't parse my degree of lostness.

And, more importantly, would such nuance matter? When it comes to feeling lost, anything above “a little” is more than you want.

5 points: The most common number used, including by our ol’ dude Likert. This offers 2 gradations per side – strong and regular. Or 4 gradations above zero.

It isn’t overwhelming nuance. And if you want to anchor the points, you can probably come up with five clearly leveled and even words.

6 points: For the nuance of 5 points, but eliminating fence-sitters. Those fence-sitters have two places to land – slightly up or slightly down?

7 points: At this point, you are in bi-directional scale territory and are aiming for more nuance in your data. You are likely considering not anchoring with words.

Another consideration to go big? Comparisons. If you want to compare subsets of people or time, that nuance may detect more subtle changes. Think about it:

Pre-test, I kinda agree. 4 out of 5.
Post-test, I agree more than before! But not, like, strongly. Still 4 out of 5.

That big, honkin’ grain size can’t detect my change. In the spirit of Shark Week: You’re gonna need a bigger scale.

(Or a single-direction one. If no one is considering the negative side at pre... why is it there?)

10 to 100 points: At this size, you really want that nuance and it feels right to let the numbers do the work of describing feelings and results. Ultimately, you're aiming for more of a "vibe check" than for everyone agreeing on what "agree" means.
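For the code-curious, here is a rough sketch of the pre/post problem described under 7 points. It assumes (and this is purely an assumption for illustration) that a person's underlying feeling can be written as a number from 0.0 to 1.0 and that the scale slices that range into equal-width bins. The function name to_scale and the pre/post values are made up; real respondents are not this tidy.

```python
def to_scale(feeling, num_points):
    """Map a feeling in [0.0, 1.0] onto a 1..num_points rating.

    Assumes equal-width bins, which is a simplification for illustration.
    """
    if not 0.0 <= feeling <= 1.0:
        raise ValueError("feeling must be between 0.0 and 1.0")
    # min() keeps feeling == 1.0 in the top bin instead of overflowing.
    return min(int(feeling * num_points) + 1, num_points)


# Hypothetical respondent: "kinda agree" at pre, "agree more, but not strongly" at post.
pre, post = 0.62, 0.74

for num_points in (5, 7, 10):
    print(num_points, "points:", to_scale(pre, num_points), "->", to_scale(post, num_points))
# 5 points: 4 -> 4
# 7 points: 5 -> 6
# 10 points: 7 -> 8
```

Same person, same change in feeling. On the 5-point scale the change vanishes inside one bin, while the 7- and 10-point versions pick it up.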

Real World Example:

In the museum world, there's a nifty rating scale called the Overall Experience Rating – or OER. It was developed by researchers at the Smithsonian in the early 2000s, specifically for evaluating reaction to exhibits.

It is a 5-point, anchored scale:

How would you rate your overall experience at ___ today?

Poor
Fair
Good
Excellent
Superior [more recently, Outstanding]

Let's break it down:

Bi-directional
Anchored
The midpoint isn’t neutral, it’s positive!

The genius move was recognizing that people are very positive about the museums they visit and experiences they have. You do good work, and they’re gonna tell you that you do good work!

The OER’s solution was to extend the high end of their scale – adding a level beyond “excellent.”

If someone enjoyed the exhibit experience, they have three distinct and positive options to consider – good, excellent, and outstanding. They can give you an excellent, which is (in fact) excellent! But there’s also space to indicate an experience that was above-and-beyond stupendous.

That nuance can be helpful to staff to find useful trends within super-positive data. Because seeing 80% excellent over and over doesn’t tell you as much as:

20% excellent and 60% outstanding
60% excellent and 20% outstanding

They thought about their context, their audience, their need for using the data. And they created the scale accordingly.
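If you like to see it in numbers, here is a small sketch of the comparison above. The responses are invented (10 imaginary visitors per exhibit), and percent_by_label is a hypothetical helper, not anything from the Smithsonian's own tooling. It uses the newer "Outstanding" label for the top point.

```python
from collections import Counter

OER_LABELS = ["Poor", "Fair", "Good", "Excellent", "Outstanding"]


def percent_by_label(responses):
    """Return {label: whole-number percent} for a list of OER responses."""
    if not responses:
        raise ValueError("need at least one response")
    counts = Counter(responses)
    total = len(responses)
    return {label: round(100 * counts[label] / total) for label in OER_LABELS}


# Made-up data: two exhibits, 10 visitors each.
exhibit_a = ["Good"] * 2 + ["Excellent"] * 2 + ["Outstanding"] * 6
exhibit_b = ["Good"] * 2 + ["Excellent"] * 6 + ["Outstanding"] * 2

for name, responses in (("A", exhibit_a), ("B", exhibit_b)):
    pct = percent_by_label(responses)
    excellent_or_better = pct["Excellent"] + pct["Outstanding"]
    print(name, "excellent or better:", excellent_or_better, "outstanding:", pct["Outstanding"])
# A excellent or better: 80 outstanding: 60
# B excellent or better: 80 outstanding: 20
```

Lump the top two categories together and both exhibits look identical at 80%. Keep them separate and the difference between the two jumps out.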

No magic number needed.

What's the most confusing survey question you've ever been asked to fill out? (And did you also hide your confusion in "neither agree nor disagree"? I know I have!)

Nerd Out with Related Reading

If you want to keep thinking about the nitty-gritty of getting your measurement exactly right for your situation, may we suggest:

Grain Size Matters: Aligning those statements and prompts with what people actually experience -- or what can change in the time you have -- can make all the difference.

When Pre/Post Is Not the Answer: If you are on the pre-post testing train, take a beat to consider if it's actually a good fit for your reality. It is not the only way.

On the Road Again

Where you can next find JSC team members out in the wild:

Visitor Studies Association Conference (July 21-23, Kansas City):

Michelle is putting on her metaphorical cowboy boots and heading to KC next month. Check out her session about Ripple Effects Mapping on Tuesday (2:45) and find out which is her evaluation Hill to Die On on Thursday (10:45).

P.S. Know someone who'd benefit from these ideas? Forward it along!

P.P.S. Got this from a colleague? Sign up to get practical insights like these every month. (Then offer to buy that colleague a beer. They seem like a good egg.)

Why the "Evaluation Therapy" Newsletter?

The moniker is light-hearted. But the origin is real. I have often seen moments when evaluation causes low-key anxiety and dread, even among evaluation enthusiasts. Maybe it feels like a black-box process sent to judge your work. Maybe it’s worry that the thing to be evaluated is complicated, not going to plan, or politically fraught. Maybe pressures abound for a "significant" study. Maybe evaluation gets tossed in your "other duties as assigned" with no support. And so much more.

Evaluation can be energizing! But the reality of the process, methods, and results means it can also feel messy, risky, or overwhelming.

I've found that straightforward conversation about the realities of evaluation and practical solutions can do wonders. Let's demystify the jargon, dial down the pressure, reveal (and get past) barriers, and ultimately create a spirit of learning (not judging) through data.

This newsletter is one resource for frank talk and learning together, one step at a time.

Learn more about JSC and our team of evaluators. Or connect with us on LinkedIn.

You are receiving this email because you signed up for our newsletters somewhere along the line. Changed your mind? No hard feelings. Unsubscribe anytime.

Wanna send us some snail mail? J. Sickler Consulting, 100 S. Commons, Suite 102, Pittsburgh, PA 15212

The Evaluation Therapy Newsletter

How many points do you put on the damn scale?
