assessment – A Thaumaturgical Compendium

Do online classes suck? (December 8, 2012)

Before arriving at my current posting, I would have thought the idea that online classes compared poorly to their offline counterparts was one that was slowly and inevitably fading away. But a recent suggestion by a colleague that we might tell incoming freshmen that real students take traditional meatspace courses and those just interested in a diploma go for the online classes caught me a bit off-guard.

I want to be able to argue that online courses are as good as their offline counterparts, but it’s difficult, because we don’t really know that. And this is for a lot of reasons.

The UoP Effect

First, if traditional and elite universities had been the originators of successful online courses and degrees, or if they had promoted those successes better (since I suspect you can find some pretty substantial successes reaching back at least three decades), we wouldn’t have the stigma of the University of Phoenix and its kin. For many, UoP is synonymous with online education, particularly in these parts (i.e., Phoenix).

Is UoP that bad? I don’t know. All I have to judge them on is people I’ve met with UoP degrees (I was not at all impressed), and what I’ve heard from students. What I do know is that they spend a lot of money on advertising and recruiting, and not very much money on faculty, which to me suggests that it is a bad deal.

Many faculty see what UoP and even worse for-profit start-ups are doing and rightly perceive it as a pretty impoverished model for higher education. They rightly worry that if their own university becomes known for online education, it will carry the same stigma a University of Phoenix degree does.

The Adjuncts

At ASU, as with many other research universities, the online courses are far more likely to be taught by contingent faculty than by core tenure-track faculty, and as a result the students are more likely to end up with the second-string. I’ll apologize in advance for demeaning adjuncts: I know full well that if you stack up the best teachers in any department there is a good chance that adjuncts will be among them, or even predominate. But on average, I suspect that a class taught by an adjunct instructor is simply not as good as one taught by full-time research faculty. There are a lot of reasons for this, but perhaps the most important one is that they do not have the level of support from the university that regular faculty do.

I’ve been told by a colleague here that they wanted to teach in the online program but were told that they were “too expensive” to be employed in that capacity. And there is a model that is beginning to separate out course design, “delivery” (ugh!) or “facilitation,” and evaluation. But I suspect the main reason more full-time faculty don’t teach online is more complicated.

Online is for training, not complex topics

This used to be “Would you trust a brain surgeon with an online degree?” which is actually a pretty odd question. Brain surgeons in some ways have more in common with auto mechanics than they do with engineers, but the point was to test whether you would put yourself in mortal danger if you were claiming online education was good. Given how much surgery is now done using computer-controlled tools, I think some of that question is moot now, but there remains this idea that you can learn how to use Excel online, but you certainly cannot learn about social theory without the give-and-take of a seminar.

It’s a position that is hard for me to argue against, in large part because it’s how almost all of us in academia learned about these things. I too have been taught in that environment, and for the most part, my teaching is in that environment. As one colleague noted, teaching in a physical classroom is something they have been taught how to do and they have honed their craft; they do it really well. Why are they forced to compete for students with online courses when they know they would not be as effective a teacher in that environment?

But in many ways this is a self-fulfilling prophecy. Few schools require “traditional” faculty to teach online, though they may allow or even encourage it. As a result the best teachers are not necessarily trying to figure out how to make online learning great. We are left with the poor substitute of models coming from industry (modules teaching employees why they should wear a hair net) and the cult of the instructional designer.

Instructional Designers

Since I’ve already insulted adjuncts, I’ll extend the insult to instructional designers. I know a lot of brilliant ones, but the “best practices” make online education into the spoon-feeding, idiot-proof nonsense that many faculty think it is. It is as if the worst of college education has been simmered until you get it down to a fine paste, and this paste can be flavored with “subject expertise.” Many are Blackboard personified.

When you receive a call–as I recently did–for proposals to change your course so that it can be graded automatically, using multiple guess exams and the like, it makes you wonder what the administration thinks good teaching is.

I am a systematizer. I love the idea of learning objectives aligned with assessments and all that jazz. But in sitting through a seminar on Quality Matters recently, we found ourselves critiquing a course that encouraged participation on a discussion board. How did discussion align with the learning objectives? It didn’t. OK, let’s reverse engineer it. How can you come up with a learning objective, other than “can discuss matters cogently in an online forum,” that encourages the use of discussion-based learning? Frankly, one of the outcomes of discussion is a personalized form of learning, a learning outcome that really comes out as “Please put your own learning outcome here, decided either before or after the class.” Naturally, such a learning outcome won’t sit well with those who follow the traditional mantra of instructional design.

QM has its heart in the right place: it provides a nice guideline for making online courses more usable, and that’s important. But what is vital is making online spaces worthy of big ideas, and not just training exercises.

The Numbers

I like the idea of the MOOC, and frankly, it makes a lot of sense for a lot of courses. It’s funny when people claim their 100-student in-person class is more engaging than a 1,000-student online course. In most cases, this is balderdash. Perhaps it is a different experience for the 10 people who sit up front and talk, but generally, big classes online are better for more students than big classes off.

Now, if you are a good teacher, chances are you do more than lecture-and-test. You get students into small groups, and they work together on meaningful projects, and the like. Guess what: that’s true of the good online instructors as well.

I think you can create courses that scale without reducing them to delivery-and-test. ASU is known for doing large-scale adaptive learning for our basic math courses, for example, and I think there are models for large-scale conversation that can be applied to scalable models for teaching. It requires decentering the instructor–something many of my colleagues are far from comfortable with–but I am convinced highly scalable models for interaction can be developed further. But scalable courses aren’t the only alternative.

I think the Semester Online project, which allows students from a consortium of universities to take specialized small classes online, is a great way to start to break the “online = big” perception. Moreover, you can make small online course materials and interactions open, leading to a kind of TOOC (Tiny Open Online Course) or a Course as a Fishbowl.

Assessment as Essential

I’ll admit, I’m not really a big part of the institutionalized assessment process. But it strikes me as odd that tenure, and our continued employment as professors, is largely based on an assessment of the quality of our research, not just how many papers we put out–though of course, volume isn’t ignored. On the other hand, in almost every department in the US, budgeting and success are based on FTEs: how can you produce more student hours with fewer faculty hours? Yes, there is recognition for effective and innovative teaching. But when the rubber hits the road, it’s the FTEs that count.

Critics of online education could be at least quieted a bit if there were strong structures of course and program assessment. Not just something that gets thrown out there when accreditation comes up, but something that allowed for the ongoing open assessment of what students were learning in each class. This would change the value proposition, and make us rethink a lot of our decisions. It would also provide a much better basis for deciding on teachers’ effectiveness (although the teacher is only one part of what leads to learning in a course) than student evals alone.

This wouldn’t fix everything. It may very well be that people learn better in small, in-person classrooms, but that it costs too much to do that for every student or for every course. The more likely outcome, it seems to me, is that some people learn some things better online than they do offline. If that’s the case, it would take the air out of the idea that large institutions are pursuing online education just because it is better for their bottom line.

In any case, the idea that we are making serious, long-term investments and decisions in the absence of these kinds of data strikes me as careless. Assessment doesn’t come for free, and there will be people who resist the process, but it seems like a far better metric of success than does butts in seats.

BlogPost Progress Report: peer assessment (May 9, 2012)

Over the last four semesters, beginning in the spring of 2011, I have been using a badge system that allows for peer review and the awarding of badges that can then be shared on the open badge infrastructure. As with many of my experiments with educational technologies, I figured the best way to learn what works is just to dive in and muddle through. I initially intended to start without any specific infrastructure, just running through the process via a wiki, but instead I coded a simple system for managing the badge process, and have tweaked it over time.

It doesn’t really work, but it works well enough, and thanks to some patient and very helpful students, I now know a great deal more about how badges can work in higher education. I make no claim to my successes being best practices, but I at least know more now than when I started, and figured I would share some of this experience.

Why did you do that?

More than a decade ago, I coded my first blog system for a course, though the term was not widely used then. I did it because there were particular kinds of interactions I wanted to encourage, and existing applications didn’t do quite what I wanted them to. I created my BadgePost system for the same reason. I am not really a coder (I dabble) but what I wanted did not exist, and so I took a shot at prototyping something that might work. (As an aside, I also hope that what happened with blogs happens with badges, and I can download the equivalent of WordPress soon instead of having to roll my own.) I knew I wanted:

Peer assessment. I wanted to get myself out of the role of sole reviewer. In many cases peers can give better advice than I can. One of the main difficulties of teaching is rewinding to the perspective of the student, and that can be easier, in some cases, for those who have just learned something. I wanted to enable that kind of open peer review in both hybrid courses and those taught entirely online.

Mastery. I also wanted desperately to get away from letter grades, as they seemed like a plague, not just for undergrad courses, but for grad as well. Students seemed far more interested in the grade than they were in learning something, a refrain I’ve heard frequently from a lot of my colleagues. I wanted to move the focus off of the grade.

Peers as cases. Students often ask me for models of good work, and because I change assignments so frequently, I rarely have a “model.” The advantage to open assessment that travels beyond a single course is that there are exemplars to look at, and (hopefully) they are diverse enough not to stifle creative interpretations by new students.

Unbundling the credential from the course. I had a number of problems that seemed to swirl around the equation of course time to learning objectives. For one, in the required technical courses, some people came in with nothing and others with extensive knowledge, and I wanted to try to address the issue of not all students moving through a program in lock-step. I wanted a back door to reduce redundancy and have instructors know that their students were coming into a course with certain skills. Finally, I wanted to give students a range of choices so that they could pursue the areas they were most interested in.

I also wanted non-paying non-Quinnipiac students participating in my courses to have a portable credential to show for it. And I wanted paying, matriculating students to have an easier way of communicating the kinds of things they had learned in the program.

I won’t cover all of these in detail, but will expound a bit more on the assessment and assessing piece…

Peer Assessment

There have been suggestions that the credentialing aspect of badges is separate from the process of assessment that leads to the badge, but in practice I think it’s both likely that they get rolled together, and beneficial when they are. Frankly, students don’t see the distinction, and they can reinforce each other in interesting ways. So, while I have done peer critique in the past, from the outset here, I wanted to get students involved in the process of granting badges via peer critique.

A lot of this was influenced by discussions with Philipp Schmidt and the application of badges in Peer2Peer University. I have long stated the goal of “disappearing” as an instructor in a course, and the place where that disappearance is hardest is when it comes to grading. (And assessment, not the same thing, but bound together.) From the outset, I saw the authority of a badge as vested in the material presented as evidence of learning, and the open endorsement/assessment of that work by peers.

Lots of reasons for this, but part was as a demotivator. That is, my least favorite question on the first day of classes is “how do I get an A?” I am always tempted to tell the truth: “I don’t care, and I wish you didn’t either.” So, I wanted badges to provide a way of getting away from that linear grading scale. I went so far as to basically throw grades out, saying that if you showed up on something approaching a regular basis, you’d get an A.

I should say that this was a failure. If anything, students paid more attention to grades because the unique system made them have to think about it. It wasn’t onerous, but a lot more of the course became about the assessment process. And it’s funny: my desire to escape grading as a focus and process did a 180, and I am now all about assessment. I should explain…

I hate giving traditional tests (I don’t think they show anything), and hate empty work. And while I now know I like ideas around authentic assessment, from the outside these seemed a lot like more of the same. Now, not only do I think formative assessment is the key element of learning, but that the skill of assessing work in any field is what essentially defines expertise. Being able to tell what constitutes good work allows you to improve the work of others, and importantly, of yourself. At the core of teaching is figuring out what in a piece of work is good, what needs improvement, and how the creator can improve her work.

Beyond Binary

I had expected students to do the work, apply for a badge, and then either get it or not. A lot of other people new to badges seem to have a similar expectation. Just the opposite occurred, and a lot of the changes to my badge system have been to accommodate this.

First, a lot of work that really was not ready for a badge was submitted. I kind of expected students to be very sure of the work that they submitted for a badge, in part because of my experience with blogging in classes, and seeing that students were more careful about their writing when it was for a peer audience. Instead, students often presented work that was not enough for a badge, or barely enough for a badge. I was pleasantly surprised by how much feedback, and in what detail, students gave to their peers.

One of the more concrete changes I made to the system was to move from a binary endorsement (qualified or not, on a number of factors), to a sliding scale, with the center point being passing, and the ability of reviewers to come back and revise their “vote.” As a result, you can see from the evidence of a badge not just what the student has done, but whether their peers thought this was acceptable or awesome.
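
In rough terms, each endorsement is now a revisable score rather than a yes/no vote. A minimal sketch of that kind of record, in Python, might look like the following; the class, the field names, and the 0–100 scale with 50 as the passing midpoint are shorthand for illustration, not the actual code behind the system.

```python
from dataclasses import dataclass, field
from datetime import datetime

PASSING = 50  # midpoint of a hypothetical 0-100 scale: at or above is acceptable

@dataclass
class Endorsement:
    """One reviewer's current judgment of a badge application, revisable over time."""
    reviewer: str
    score: int                 # 0-100; PASSING marks "good enough for the badge"
    comment: str = ""
    revisions: list = field(default_factory=list)  # earlier (timestamp, score) pairs

    def revise(self, new_score: int, comment: str = "") -> None:
        # Keep the earlier vote around, so the evidence trail stays visible.
        self.revisions.append((datetime.utcnow(), self.score))
        self.score = new_score
        if comment:
            self.comment = comment

def peer_verdict(endorsements: list) -> str:
    """Summarize what peers thought: not just pass/fail, but acceptable vs. awesome."""
    if not endorsements:
        return "pending"
    avg = sum(e.score for e in endorsements) / len(endorsements)
    if avg < PASSING:
        return "not yet"
    return "awesome" if avg >= 85 else "acceptable"
```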

I’ve also been surprised by how many nominated themselves for “aspirational” badges. When a user selects a badge, it is moved into their “pending” category, and I was confused by so many pending badges that had no evidence uploaded. But students seem to click on these as a kind of note to themselves that this is what they are pursuing. This, incidentally, leads to a problem for reviewers who look at a pending badge before it is ready, and find that process frustrating, but one of the things that needs to improve in the system is communicating such progress. I didn’t plan to need to do that, since I saw badges as an end point rather than a process.

The Reappearing Teacher

The other surprise was just how interested students were in getting my imprimatur. But the reason, in this case, was not the grade–they had that. They actually valued my response as an expert a bit more, I think. This was a refreshing change from students turning to the back page of a graded paper to see the grade, and then throwing it out before reading any of my comments. No doubt, some of this comes from a lack of confidence in their peers as well, and I’ve found that in some cases this lack is reasonable.

In some ways, I’m trying to encourage the sempai/kohai relationship, of those who have “gone before” and therefore have more to say about a particular badge. I’ve been reluctant to limit approval to only those who actually have the badge (in part for reasons I’ll note below regarding encouraging reviews), but I may do more of that. There are some kinds of assessment, though, that don’t require having the badge. I don’t need to know how to create a magic trick to be amazed by it, for example. So I don’t want to rule out this kind of “audience assessment.” There is also space for automated assessment. For example, for some badges you need to show a minimal number of tweets, or comments, or responses to comments, or (e.g.) valid HTML. There is no reason to have a human do these pieces of the assessments, though I would hate to see badges that did not involve human assessment, in large part because, again, I think building the capacity to do assessments is an important part of the system.
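
Those countable pieces are exactly the sort of thing a small script can check on its own. A sketch of one such automated check appears below; the hashtag and the threshold are invented purely for the example.

```python
def enough_tweets(tweets, hashtag="#icmclass", minimum=20):
    """Automated slice of an assessment: did the student post at least
    `minimum` tweets tagged for the course? The hashtag and the threshold
    here are made up for illustration."""
    tagged = [t for t in tweets if hashtag.lower() in t.lower()]
    return len(tagged) >= minimum, f"{len(tagged)} of {minimum} required tagged tweets"
```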

The Other Motivation

I began by hoping students would ignore the grading process, and have evolved to think that they should pay a lot of attention to assessment. In some courses, students have jumped into peer assessment. In others–and particularly the undergraduate course I’m teaching this semester–they were slow to get started. I want to think about why people assess, and how to motivate them to be involved.

When I did peer assessments in the pre-badge world, I assigned a grade for the quality of the assessment provided. I want to do something similar here, and a lot of this comes of a discussion with Philipp Schmidt in Chicago last year. The meta-project here is getting students to be able to analytically assess work and communicate that. Yes, you could do an “expert assessor” badge, or something similar, but really it is more essential to the overall project.

One way to do this is inter-coder reliability. If I am considered an expert in the area (and in the current system, this is defined as having badges at a higher level than the one in question, within the same “vertical”), those with less experience should be able to spot the same kinds of things I do, and arrive at a similar quantitative result on the assessments.

So, for example, if someone submits the write-up of a content analysis, two of her peers might look at it and come up with two very different assessments of the methods section of the article. Alice may say that it is outstanding, 90/100 on the scale of a particular rubric. Frank might disagree, putting it at 25/100. Of course, both would provide some textual explanation for why they reached these conclusions. Then I come along and give it a 30/100, along with my own critique.
The dynamics of getting students to do peer assessments (in some courses they did a lot, in others they did not), and my involvement in the assessment, are an interesting piece for me. In this case, Frank should receive some sort of indication within the system that he has done a good job of performing the assessment.

I’m still working out a way to do this that isn’t unnecessarily complex. Right now there is a karma system that gives users karma for performing assessments, with multipliers for agreeing with more experienced assessors, but this is complicated to “tune” and non-intuitive.

There is also the issue of when various levels perform the assessment. For the above process to work, Alice and Frank both need to get their assessments in before I do, and shouldn’t get the same kind of kudos for “me too” assessments after the fact.
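
To make the idea concrete, the kind of rule I have been circling around looks roughly like the sketch below; the function name, the 0–100 rubric scale, and the particular multipliers are stand-ins, since the tuning is exactly the part that remains messy.

```python
from datetime import datetime, timedelta

def assessment_karma(peer_score, expert_score, peer_time, expert_time,
                     base=1.0, max_multiplier=3.0):
    """Karma earned for one peer assessment.

    Assumptions, for illustration only:
    - scores sit on a 0-100 rubric scale;
    - only assessments filed before the expert's earn the agreement bonus,
      so "me too" reviews submitted afterward get base credit alone;
    - agreement scales linearly, from no bonus (maximally divergent)
      up to the full multiplier (identical scores).
    """
    if peer_time >= expert_time:
        return base
    agreement = 1.0 - abs(peer_score - expert_score) / 100.0
    return base + (max_multiplier - 1.0) * agreement

# With the Alice/Frank example above (the expert lands on 30/100):
expert_at = datetime(2012, 5, 1, 12, 0)
day_before = expert_at - timedelta(days=1)
frank = assessment_karma(25, 30, day_before, expert_at)   # 2.9: close agreement, filed early
alice = assessment_karma(90, 30, day_before, expert_at)   # 1.8: large disagreement
```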

Badges

None of this is necessarily about badges, but it leaves a trail of evidence, conversation, and assessment behind. One of the big questions is whether badge records should be formative or summative. As I said, the degree to which students have engaged in badges as a process rather than an outcome came as a bit of a surprise to me. Right now, much of that process happens pretty openly, but I can fully understand how someone well on in their career may not want to expose fully their learning process. (“May” is operative here–I think doing so is valuable for the learning community!)

On the other hand, I think badges that appeal to authority undermine the whole reason badges are not evil. Badges that make an authoritative appeal (“Yale gave me this badge so it must be good.”) simply reinforce many of the bad structures of learning and credentialing that currently exist. Far better is a record of the work done to show that you understand something or can do something, along with the peers that helped you get there, pointed to and easily found via a digital badge.

Balancing the privacy needs with the need to authentically vest the badge with some authority will be an interesting feat. I suspect I may provide ways of hiding “the work” and only displaying the final version (and final critiques) to the outside world, while preserving the sausage-making process for the learning community itself. But this remains a tricky balance.

Buffet Evals (May 3, 2012)

“Leon Rothberg, Ph.D., a 58-year-old professor of English Literature at Ohio State University, was shocked and saddened Monday after receiving a sub-par mid-semester evaluation from freshman student Chad Berner. The circles labeled 4 and 5 on the Scan-Tron form were predominantly filled in, placing Rothberg’s teaching skill in the ‘below average’ to ‘poor’ range.”

So begins an article in what has become one of the truthiest sources of news on the web. But it is no longer time for mid-semester evals. In most of the US, classes are wrapping up, and professors are chest-deep in grading. And the students–the students are also grading.

Few faculty are great fans of student evaluations, and I think with good reason. Even the best designed instruments–and few are well designed–treat the course like a marketing survey. How did you feel about the textbook that was chosen? Were the tests too hard? And tell us, were you entertained?

Were the student evals used for marketing, that would probably be OK. At a couple of the universities where I taught, evals were made publicly available, allowing students a glimpse of what to expect from a course or a professor. While that has its own problems, it’s not a bad use of the practice. It can also be helpful for a professor who is student-centered (and that should be all of us) and wants to consider this response when redesigning the course. I certainly have benefited from evaluations in that way.

Their primary importance on the university campus, however, is as a measure of teaching effectiveness. Often, they are used as the main measure of such effectiveness–especially for tenure, and now, as many universities incorporate more rigorous post-tenure evaluation, there as well.

Teaching to the Test

A former colleague, who shall remain nameless, noted that priming the student evals was actually pretty easily done, and started with the syllabus. You note why your text choice is appropriate, how you are making sure grading is fair, indicate the methods you use to be well organized and speak clearly, etc. Throughout the semester, you keep using the terms used on the evals to make clear how outstanding a professor you really are. While not all the students may fall for this, a good proportion would, he surmised.

(Yes, this faculty member had ridiculously good teaching evaluations. But from what I knew, he was also an outstanding teacher.)

Or you could just change your wardrobe. Or do one of a dozen other things the literature suggests improves student evaluations.

Or you could do what my car dealership does and prominently note that you are going to be surveyed and if you can’t answer “Excellent” to any item, to please bring it to their attention so they can get to excellent. This verges on slimy, and I can imagine, in the final third of the semester, that if I said this it might even cross over into unethical. Of course, if I do the same for students–give them an opportunity to get to the A–it is called mastery learning, and can actually be a pretty effective use of formative assessment.

Or you could do what an Amazon seller has recently done for me, and offer students $10 to remove any negative evaluations. But I think that clearly crosses the line, both in Amazon’s case and in the classroom. (That said, I have on one occasion had students fill out evals in a bar after buying them a pitcher of beer.)

It is perhaps a testament to the general character of the professoriate that in an environment where student evaluations have come to be disproportionately influential on our careers, such manipulation–if it occurs at all–is extremely rare.

It’s the nature of the beast, though: we focus on what is measured. If what is being measured is student attitudes toward the course and the professor, we will naturally focus on those attitudes. While such attitudes are related to the ability to learn new material, they are not equivalent.

Doctor Feelgood

Imagine a hospital that promoted doctors (or dismissed them) based largely on patient reviews. Some of you may be saying “that would be awesome.” Given the way many doctors relate to patients, I am right there with you. My current doctor, Ernest Young, actually takes time to talk to me, listens to me, and seems to care about my health, which makes me want to care about my health too. So, good. And frankly, I do think that student (and patient) evaluation serves an important role.

But–and mind you I really have no idea how hospitals evaluate their staff–I suspect there are other metrics involved. Probably some metrics we would prefer were not (how many patients the doctor sees in an hour) and some that we are happy about (how many patients manage to stay alive). As I type this, I strongly suspect that hospitals are not making use of these outcome measures, but I would be pleased to hear otherwise.

A hospital that promoted only doctors who made patients think they were doing better, and who made important medical decisions for them, and who fed them drugs on demand would be a not-so-great place to go to get well. Likewise, a university that promotes faculty who inflate grades, reduce workload to nil, and focus on entertainment to the exclusion of learning would also be a pretty bad place to spend four years.

If we are talking about teaching effectiveness, we should measure outcomes: do students walk out of the classroom knowing much more than they did when they walked in? And we may also want to measure performance: are professors following practices that we know promote learning? The worst people to determine these things: the legislature. The second worst: the students. The third worst: fellow faculty.

Faculty should have their students evaluated by someone else. They should have their teaching performance peer reviewed–and not just by their departmental colleagues. And yes, well designed student evaluations could remain a part of this picture, but they shouldn’t be the whole thing.

Buffet Evals

I would guess that 95% of my courses are in the top half on average evals, and that a slightly smaller percentage are in the top quarter. (At SUNY Buffalo, our means were reported against department, school, and university means, as well as weighted against our average grade in the course. Not the case at Quinnipiac.) So, my student evals tend not to suck, but there are also faculty who much more consistently get top marks. In some cases, this is because they are young, charming, and cool–three things I emphatically am not. But in many cases it is because they really care about teaching.

These are the people who need to lead reform of the use of teaching evaluation use in tenure and promotion. It’s true, a lot of them probably like reading their own reviews, and probably agree with their students that they do, indeed, rock. But a fair number I’ve talked to recognize that these evals are given far more weight than they deserve. Right now, the most vocal opponents to student evaluations are those who are–both fairly and unfairly–consistently savaged by their students at the end of the semester.

We need those who have heart-stoppingly perfect evaluations to stand up and say that we need to not pay so much attention to evaluations. I’m not going to hold my breath on that one.

Short of this, we need to create systems of evaluating teaching that are at least reasonably easy and can begin to crowd out the student eval as the sole quantitative measure of teaching effectiveness.

Badges: The Skeptical Evangelist (March 6, 2012)

I have been meaning to find a moment to write about learning badges for some time. I wanted to respond to the last run of criticisms of learning badges, and the most I managed was a brief comment on Alex Reid’s post. Now, with the announcement of the winners of this year’s DML Competition, there comes another set of criticisms of the idea of badges in learning. This isn’t an attempt to defend badges–I don’t think such a defence is necessary. It is instead an attempt to understand why they are worthy of such easy dismissal by many people.

Good? Bad?

My advisor one day related the story of a local news crew that came to interview him in his office. This would have been in the mid-1990s. The first question the reporter asked him was: “The Internet: Good? Or Bad?”

Technologies have politics, but the obvious answer to that obvious question is “Yes.” Just as when people ask about computers and learning, the answer is that technology can be a force for oppressive, ordered, adaptive multiple-choice “Computer Aided Teaching,” or it can be used to provide a platform for autonomous, participatory, authentic interaction. If there is a tendency, it is one that is largely reflective of existing structures of power. But that doesn’t mean you throw the baby out with the bathwater. On the whole, I think computers provide more opportunities for learning than threats to it, but I’ll be the first to admit that outcome was neither predestined nor obvious. It still isn’t.

Are there dangers inherent to the very idea of badges? I think there are. I’ve written a bit about them in a recent article on the genealogy of badges. But just as I can find Herb Schiller’s work on the role of computer technology in cultural hegemony compelling, but still entertain its emancipatory possibilities, I can acknowledge that badges have a long and unfortunate past, and still recognize in them a potential tool for disrupting the currently dominant patterns of assessment in institutionalized settings, and building bridges between informal and formal learning environments.

Ultimately, what is so confusing to me is that I agree wholeheartedly with many of the critics of badges, and reach different conclusions. To look at how some badges have been used in the past and not be concerned about the ways they might be applied in the future would require a healthy amount of selective perception. I have no doubt that badges, badly applied, are dangerous. But so are table saws and genetic engineering. The question is whether they can also be used to positive ends.

Over the last year, I’ve used badges to such positive ends. My own experience suggests that they can be an effective way of improving and structuring peer learning communities and forms of authentic assessment. I know others have had similar successes. So, I will wholeheartedly agree with many of the critics: badges can be poorly employed. Indeed, I suspect they will be poorly employed. But the same can be said of just about any technology. The real question is if there is also some promise that they could represent an effective tool for opening up learning, and providing the leverage needed to create new forms of assessment.

Gold Stars

One of the main critiques of badges suggests that they represent extrinsic forms of motivation to the natural exclusion of intrinsic motivation. Mitch Resnick makes the case here:

I worry that students will focus on accumulating badges rather than making connections with the ideas and material associated with the badges – the same way that students too often focus on grades in a class rather than the material in the class, or the points in an educational game rather than the ideas in the game.

I worry about the same thing. I will note in passing that at worst, he is describing a situation that does no harm: replacing a scalar (A-F letter grades) with a system of extrinsic motivation that is more multidimensional. But the problem remains: if badges are being used chiefly as a way of motivating students, this is probably not going to end well.

And I will note that many educators I’ve met are excited about badges precisely because they see them as ways of motivating students. I think that if you had to limit the influences of using badges to three areas, they would be motivation, assessment, and credentialing. The first of these is often seen as the most important, and not just by the “bad” badgers, but by many who are actively a part of the community promoting learning badges.

(As an aside, I think there are important applications of badges beyond these “big three.” I think they can be used, for example, as a way for a community to collaboratively structure and restructure their view of how different forms of local knowledge are related and I think they can provide a neophyte a map of this knowledge, and an expert a way of tracing their learning autobiography over time. I suspect there are other implications as well.)

Perhaps my biggest frustration is the ways in which badges are automatically tied to gamification. I think there are ways that games can be used for learning, and I know that a lot of the discussion around badges comes from their use in computer games, but for a number of reasons I think the tie is unfortunate; not least, badges in games are often seen primarily as a way of motivating players to do something they would otherwise not do.

Badges and Assessment

The other way in which I worry about computer gaming badges as a model is the way they are awarded. I think that both learning informatics and “stealth assessment” have their place, but if misapplied they can be very dangerous. My own application of badges puts formative assessment by actual humans (especially peers) at the core. Over time I have come to believe that the essential skill of the expert is an ability to assess. If someone can effectively determine whether something is “good”–a good fit, a good solution, aesthetically pleasing, interesting, etc.–she can then apply that to her own work. Only through this critical view can learning take place.

For me, badges provide a framework for engaging effectively in assessment within a learning community. This seems also to be true for Barry Joseph, who suggests some good and bad examples of badge use here. Can this kind of re-imagination of assessment happen outside of a “badge” construct? Certainly. But badges provide a way of structuring assessment that provides scaffolding without significant constraints. This is particularly true when the community is involved in the continual creation and revision of the badges and what they represent.

Boundary Objects

Badges provide the opportunity to represent knowledge and power within a learning community. Any such representation comes with a dash of danger. The physical structuring of communities: who gets to talk to whom and when, where people sit and stand, gaze–all these things are dangerous. But providing markers of knowledge is not inherently a bad thing, and particularly as learning communities move online and lose some of the social and cultural context, finding those who know something can be difficult.

This becomes even more difficult as people move from one learning community to another. Georg Simmel described the intersection of such social circles as the quintessential property of modern society. You choose your circles, and you have markers of standing that might travel with you to a certain degree. We know what these are: and the college degree is one of the most significant.

I went to graduate school with students who finished their undergraduate degrees at Evergreen State College, and have been on admissions committees that considered Evergreen transcripts in making admissions decisions. Evergreen provides narrative assessments of student work, and while I wholeheartedly stand by the practice–as a great divergence if not a model–it makes understanding a learning experience difficult for those outside the community. Wouldn’t it be nice to have a table of contents? A visual guide through a learning portfolio and narrative evaluation? A way of representing abilities and learning to those unfamiliar with the community in which it occurred?

I came to badges because I was interested in alternative ways of indicating learning. I think that open resources and communities of learning are vitally important, but I know that universities will cling to the diploma as a source of tuition dollars and social capital. Badges represent one way of nibbling at the commodity of the college diploma.

Badges, if done badly, just become another commodity: a replacement of authentic learning with a powerful image. To me, badges when done well are nothing more than a pointer. In an era when storing and transmitting vast amounts of content is simple, there is no technical need for badges as a replacement. But as a way of structuring and organizing a personal narrative, and relating knowledge learned in one place to the ideas found in another, badges represent a bridge and a pointer.

This is one reason I strongly endorsed the inclusion of an “evidence” url in the Mozilla Open Badge Infrastructure schema. Of course, the OBI is not the only way of representing badges, nor does it intend to represent only learning badges–there is a danger here of confusing the medium and the message. Nonetheless, it does make for an easier exchange and presentation of badges, and importantly, a way of quickly finding the work that under-girds a personal learning history.

All the Cool Kids Are Doing It

Henry Jenkins provides one of the most compelling cases against badges I’ve seen, though it’s less a case against badges and more a case against the potential of a badgecopalypse, in which a single sort of badging system becomes ubiquitous and totalizing. Even if such a badge system followed more of the “good” patterns on Barry Joseph’s list than the “bad,” it would nonetheless create a space in which participation was largely expected and required.

Some of this comes of the groups that came together around the badge competition. If it were, like several years ago, something that a few people were experimenting with on the periphery, I suspect we would see little conversation. But when foundations and technologists, the Department of Education and NASA, all get behind a new way of doing something, I think it is appropriate to be concerned that it might obliterate other interesting approaches. I share Jenkins’ worry that interesting approaches might easily be cast aside by the DML Competition (though I will readily concede that may be because I was a loser–er, “unfunded winner”–in the competition) and hope that the projects that move forward do so with open, experimental eyes, allowing their various communities to help iteratively guide the application of badges to their own ends. I worry that by winnowing 500 applications to 30, we may have already begun to centralize what “counts” in approaches to badges. But perhaps the skeptical posts I’ve linked to here provide evidence of the contrary: that the competition has encouraged a healthy public dialog around alternative assessment, and badges represent a kind of “conversation piece.”

Ultimately, it is important that critical voices of approaches to badges remain at the core of the discussion. My greatest concern is that the perception that there are badge evangelists and skeptics is in fact true. I certainly think of myself as both, and I hope that others feel the same way.

Rank Teacher Ranking (February 24, 2012)

There has been a little discussion on an informal email list at my university about the Op-Ed by Bill Gates in the New York Times that argues against public rankings of teachers. It’s a position that in some ways constrains the Gates Foundation’s seeming interest in quantifying teaching performance. It led to questions we have tried to face about deciding merit in teaching, and encouraging teaching excellence at our own institution. I obviously won’t post the stream, but here’s my response to some of the discussion:

The problem with ranking is that it suggests that excellence in teaching is a uni-dimensional construct, which I think even a cursory “gut-check” says is dead wrong. When I think back to my greatest teachers, they have little in common. One was cold, condescending, and frankly not a very nice human, but he was exacting in asking us to clearly express ourselves, and his approach led to a room full of students who could clearly state an argument, lead a discussion, and understand the effects of style on philosophical argument. Another was a little scattered, but brought us into his home and family, was passionate about the field, and taught us how important it was to care about our research subjects. Another had a bit of the trickster in him, and would challenge our assumptions by setting absurd situations. And I could name another half-dozen who were excellent teachers–but one of the things that made them excellent was the unique way in which they approached the process of learning.

And frankly, if you asked a number of my undergraduate peers who the “best” teachers in our program were, there would certainly be some overlap, but it would be far from perfect. An essential question is “best for whom”? And just as our students are each unique, and we should approach them as whole people (the unfortunate fact is that we *do* rank them by grading them, but that doesn’t make the process right), we should approach faculty as… perhaps a box of chocolates. The diversity of backgrounds, styles, and approaches to teaching and learning is a strength, not a weakness. We shouldn’t all be striving to fit to the golden standard of the best among us.

Now, this is not an argument for absolute relativism: there are better and worse ways of fostering student learning. It is also not an argument against quantification or assessment: I think an essential tool for improving our teaching is operationalizing some of the abstruse concepts of “good teaching” to something measurable, and using qualitative AND quantitative assessments to help us develop as a group. But the problem with ranking faculty is that there isn’t a single scale for teaching effectiveness, nor even the three (or four, if you count “hotness”) that RateMyProfessors suggests, but dozens of different scales that we might be ranked on. And while some of us may be near the top of many of those scales, I doubt any of us are at the top of all of them.

Open Analytics and Social Fascination Talk (December 17, 2011)

[This post contained only an embedded talk; no text content is preserved here.]
What makes up a badge? (December 2, 2010)

One of the discussions I was particularly excited about at the Barcelona Drumbeat Festival was using badges to indicate certain skills, abilities, capacities, traits, or accomplishments. The idea here is what you might find in Boy Scout merit badges, or Foursquare badges, or Stack Overflow badges: a quick way to see what a person knows, can do, and identifies themselves with.

As part of my courses in the coming semester, I am abandoning standard grades and instead using badge-level assessments. As part of each course, students can earn any number of badges for demonstrated abilities. These are generally badges that require you to show that you can do something. That ability must be assessed–often by peers.

Starting with the “data” end, what kind of information must a badge hold? We talked through a lot of this in Barcelona, and I’ve been thinking a lot about it since. What appears below shouldn’t be seen as the consensus of that group–though I found the discussion valuable, a number of the items below are certainly not commonly agreed upon among those (e.g., at P2PU) who are talking about badges. At a basic level, a badge should be transparent (everything that went into getting the badge should be as visible as possible), and it should be imbued with the authority and reputation of those who were the evaluators.

Process

First, I should briefly describe the process. In the first courses, this process will largely be implemented “manually,” but you will see that there are many opportunities to automate some of these processes.

1. A person is nominated (or nominates themselves) by filling out all of the information on a form except for the endorsements.

2. Endorsers go to the form and indicate whether they feel that the candidate qualifies (for those badges that require endorsements–some may not). Note that “bots” may act as the endorsers, and check automatically whether something has occurred. In that case they behave just like human endorsers. Note also that the system that records this application should in some way verify the identity of the endorsers. We won’t do that initially, but eventually, something (e.g., an OpenID check) should provide an indication that people are who they say they are.

3. Once the endorsements are complete, a person may put this badge wherever they like on the web, with a link back to the page to show that they have earned the badge.
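
As a rough sketch of steps 2 and 3 (not the actual system, which for now is mostly a hand-edited wiki form), endorsement and completion might work something like the Python below, with bot endorsers slotting in exactly where human ones do; all of the names are hypothetical.

```python
from datetime import datetime

def endorse(application, req_index, endorser, qualifies, comments=""):
    """Step 2: record one endorser's judgment on one requirement."""
    application["requirements"][req_index]["endorsements"].append({
        "endorser": endorser,
        "qualifies": qualifies,
        "comments": comments,
        "when": datetime.utcnow().isoformat(),
    })

def bot_endorse(application, req_index, check, bot_name="count-bot"):
    """A 'bot' endorser behaves just like a human one, but its judgment
    comes from an automatic check (e.g., 'has posted at least ten comments')."""
    qualifies, note = check(application)
    endorse(application, req_index, bot_name, qualifies, note)
    return qualifies

def badge_complete(application):
    """Step 3: the badge can be displayed anywhere on the web once every
    requirement has gathered its required number of qualified endorsements."""
    return all(
        len(req["endorsements"]) >= req["endorsers_required"]
        for req in application["requirements"]
    )
```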

Nomination / Evidence Form

So, what is on that form? (With * items required.)

1. Name of the badge*

A short description of what the badge signifies: e.g., “Javascript Expert.” If it is a bootstrap badge, this should be clearly indicated in the title: e.g., “Javascript Expert [bootstrap]” (see #9 below).

2. Issuer of the badge*

Eventually, this may be something like “School of Webcraft” or “Quinnipiac University.” For this initial round, it is likely to be “ICM” for the ones I am doing.

3. Version of this badge*

Date-time last updated the badge.

A unique ID for the badge is formed by combining #1, #2, and #3: e.g., Quinnipiac University/Ph.D. in Social Computing/2011-12-25-7:00:00.

4. Badge Image*

For the purposes of standardization, I will say 250x250px PNG representing what the badge stands for.

5. Description

A textual description of what the badge represents. The idea is that it is reasonably brief–say, less than 200 words.

6. Recipient*

Who is it that is claiming the badge.

7. Nominator*

Who is it that nominated this person for the badge?

By default, any badge can be self-nominated. If for some reason you want to exclude this possibility, it could be listed as a requirement in section 9: E.g. “Candidate is nominated by someone other than themselves” or “Candidate may only be nominated by a member of the track team.”

8. When nominated*

Nomination timestamp.

9. Requirements & Evidence

This is the meat of the form. It includes 0 or more requirements, with links to evidence that those requirements were met. Each requirement includes a record like the following:

a) Textual description of the rubric for assessment. What needs to be shown, and how is an evaluator to decide whether it meets the standard. Outside examples may be linked, including former examples of successful badge earners.

b) Textual description or link to the evidence of assessment. (If a link, we’ll probably need to find a way to archive that link for posterity. Easier with some things than with others; e.g., video.)

c) Nominator’s comments on the work and why they think it qualifies.

d) Qualifications to endorse. For example, you might require that people have the badge they are endorsing, or that they have a badge that qualifies them as “instructors” in the skill (e.g., to get the “pilot” badge, you need to be endorsed by at least one person with the “pilot inspektor” badge, or to get a QU-PhD badge you need endorsements from three people with the QU-Faculty badge). You might also require that people have a badge that verifies their identity. So if I have the Verisignature–ReallyMe badge, maybe it qualifies me to endorse more badges. This is a list of required badges–there may be more than one.

e) Number of qualified endorsers required. This could be zero or a thousand.

f) List of
1. Endorser name
2. Date of endorsement
3. Comments on endorsement

Note that there is a necessary and automatic exception here in the case where there do not exist in the world the number of qualified endorsers required in (e). In that case, you must be endorsed by as many qualified endorsers as currently exist. It is then clearly indicated that the badge is a [BOOTSTRAP]. At some future point you might want to re-try the badge to get a non-bootstrapped version, once there are enough potential endorsers.

10. Issued Date-Time* (or PENDING)

11. Expires Date-Time

12. Recipient’s Comments & Notes

13. List of community comments
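
Pulled together, a machine-readable version of the record described by these thirteen fields might look roughly like the following; the key names and sample values are mine, just to show the shape, and the single requirement entry mirrors items (a) through (f).

```python
badge_application = {
    "name": "Javascript Expert",                       # 1
    "issuer": "ICM",                                   # 2
    "version": "2010-12-02-18:55:03",                  # 3  (name, issuer, and version form the unique ID)
    "image": "javascript-expert.png",                  # 4  250x250 PNG
    "description": "Can write and debug working Javascript.",              # 5
    "recipient": "student@example.edu",                # 6
    "nominator": "student@example.edu",                # 7  self-nomination allowed by default
    "nominated_on": "2010-12-02T18:55:03",             # 8
    "requirements": [                                  # 9  zero or more
        {
            "rubric": "A working script, with readable, documented code.",  # a
            "evidence": "https://example.edu/student/project",              # b
            "nominator_comments": "Meets all three criteria.",              # c
            "endorser_must_hold": ["Javascript Expert"],                    # d
            "endorsers_required": 2,                                        # e
            "endorsements": [                                               # f
                {"endorser": "peer1", "when": "2010-12-03", "comments": "Solid work."},
            ],
        },
    ],
    "issued": "PENDING",          # 10
    "expires": None,              # 11
    "recipient_comments": "",     # 12
    "community_comments": [],     # 13
}
```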

An Example Badge Template / Form

Now certain elements of the above are part of the template of a badge. So, if I nominated someone for the “Good Discussion Summarizer” I would end up with a template that included:

“Good Discussion Summarizer”
The Human Fund
1999-8-14-09:00:00

[Some Cool Badge Art that I don’t have time to dummy up at the moment]

The good discussion summarizer is issued to someone who has demonstrated that she is consistently capable of summarizing a brainstorming or other discussion in an academic setting, both verbally and textually.

Recipient:

Nominator: Alex Halavais (2010-12-2-18:55:03)

Badge Requirements, Evidence, and Endorsements

1. Statements from three members of courses in which the recipient is enrolled attesting to her abilities to accurately summarize materials. Endorser must hold the “current student” badge. (No evidence beyond the endorsements required.)

Evidence: (NB: this would be left blank.)

Endorser:
Comments:

Endorser:
Comments:

Endorser:
Comments:

2. Evaluation of a video of the candidate reviewing a discussion. Endorser must hold the “Good Discussion Summarizer” badge.
Evidence (Link to video or audio of summarization):
Endorser:
Comments:

3. Evaluation of a textual summary of the same discussion. Endorser must hold the “Good Discussion Summarizer” badge.
Evidence (Link or Pasted Text of a summary):
Endorser:
Comments:

Issued: PENDING
Expires: TBD

Candidate comments:

Community comments:

The nominator would fill out some of these, including, perhaps, being one of the endorsers.

Other Issues

The natural question is how would endorsers know to find the form? There are lots of possibilities here, including informal or direct invitations, and a queue of badge candidates needing assessment. But that is a solution that does not have to exist in the badge process itself (necessarily). The idea is to keep this piece as simple and light as possible.

Happy to hear any thoughts you might have. As I said, I’m going to take it for a test run in the Spring semester. I’ll likely just have people do it manually on the wiki, unless I find time over the break to code a simple form system that can handle the pieces. And I’ll point to the course and the badge description (as well as some of the early badges) as I write them up.

As a final note, this doesn’t in any way take away from the efforts of the Mozilla Badge Backpack approach. Indeed, one of the advantages to that system is that it might provide the opportunity for several dissimilar badge systems to work together. In this case, what would be passed along to the Backpack is just an image of the badge, its name, and a link back to the form that demonstrates how it was earned.
