Preliminary SOLO Research Findings

Photo Credit: net_efekt via Compfight cc

In this post I will outline my preliminary research findings from my investigation into using SOLO taxonomy as a framework for peer and self assessment. I will provide a brief overview of how the research was carried out before summarising the findings.

In my school we start teaching the science GCSE course (21st century science) at the start of the summer term in year 9. I teach two year 9 classes of roughly the same ability but in different halves of the year group. I was about to start GCSE units with both my classes, which presented the ideal opportunity for my master's research. I decided that I would use one class as the control and teach them as I usually would. The second class would become my experimental group and would be taught the same lessons as the control group but with the emphasis firmly placed on SOLO taxonomy.

For the experimental group I changed each lesson so that it worked towards just one overarching learning objective, which I renamed the “learning intention” following the format Steve Martin suggests in his book, “Using SOLO as a framework for teaching”. I then constructed success criteria using the SOLO taxonomy to ensure that they moved towards a deeper level of thinking. The learning intention and success criteria were shared with the experimental class at the start of every lesson and referred back to throughout each lesson. The control group had the lesson title and objectives shared with them at the start of every lesson but they were not structured using SOLO taxonomy.

The first topic to be taught was B1 – You and Your Genes. Both classes were given the end of topic test as a pre-test. This would tell me what prior knowledge each student had going into the teaching sequence and allow me to calculate the effect size of the teaching sequence once the post-test was complete. There were 12 lessons in the teaching sequence and the students used self assessment throughout. The experimental group assessed their progress against the success criteria established at the start of each lesson. They were also able to go back and revisit the success criteria when revising for the end of topic exam. The control group assessed their progress against the lesson objectives.

The second topic to be taught was C1 – Air Quality. Again both groups were given the end of topic test as a pre-test to establish a baseline. This topic had 12 lessons in the teaching sequence and this time I introduced a form of peer assessment as well as the self assessment they had used previously. Peer assessment took the form of a discussion between learning partners. They had to mark their partner's work together, giving verbal feedback explaining why they felt it had met certain success criteria. Only once the marking had been explained and agreed could the feedback be written down on the work. I chose to do the peer feedback in this fashion rather than the traditional swap books and mark because I have found in the past that students with misconceptions will mark another piece of work incorrectly. Discussing the marking brings any misconceptions to the fore and helps to correct them, and it boosts the confidence of those students who are just not sure. The control group used the same peer assessment process but marked against the lesson objectives.

When analysing my results I removed the marks of all the students who did not complete both the pre-test and the post-test. I calculated the effect size for each topic as follows. I calculated the mean of the pre-test scores and the mean of the post-test scores. I subtracted the pre-test mean from the post-test mean and then divided the result by the standard deviation. A note here about standard deviation: I decided to use a fixed standard deviation of 2 rather than calculating it from the data. My data sets are so small that an outlier in any set of data could unfairly skew the results. Using a fixed standard deviation based on the total marks available for the test is not perfect but it eliminates the possible issues caused by outliers. If I had a larger sample size then the impact of outliers would be less significant and I could be more confident in the calculated standard deviations. (I estimated the standard deviation thus: the test was out of 20. The average score would be 10. 95% of students would score between 2 and 18 marks. 18 – 2 = 16. So the estimated standard deviation for the test would be 2.)
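The calculation described above can be sketched in a few lines of Python. This is only an illustration of the steps, not the spreadsheet or tool actually used: the function name `effect_size`, the record layout and the example marks are all hypothetical, and the marks are invented rather than taken from either class.

```python
from statistics import mean

FIXED_SD = 2.0  # the fixed standard deviation chosen in this post


def effect_size(records, sd=FIXED_SD):
    """Return (post-test mean - pre-test mean) / sd.

    `records` is a list of (pre, post) marks, one pair per student.
    A mark of None means the student missed that test; those students
    are dropped before the means are calculated.
    """
    complete = [(pre, post) for pre, post in records
                if pre is not None and post is not None]
    if not complete:
        raise ValueError("no student completed both tests")
    pre_mean = mean([pre for pre, _ in complete])
    post_mean = mean([post for _, post in complete])
    return (post_mean - pre_mean) / sd


# Hypothetical marks out of 20 (NOT data from the study).
example_records = [(4, 9), (6, 10), (5, None), (7, 11), (None, 8), (3, 6)]
print(effect_size(example_records))
```

Because the standard deviation is fixed, the effect size scales inversely with whatever value is chosen: doubling `sd` halves every result. For comparison, the common range rule of thumb treats a 95% range as spanning about four standard deviations, which for a 16 mark range would give 16 / 4 = 4, so the figures in this post are best read relative to each other rather than as absolute values.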

For the first topic the control group recorded an effect size of 1.63, which seems huge when you consider that an effect size of 1.0 is typically associated with “a two grade leap in GCSE”. This compares with an effect size of 2.09 for the experimental group, a difference of 0.46, or roughly a whole grade of progress. Caution must be applied to these results as they come from a small sample and the test was only out of 20. Even so, the results suggest that using SOLO taxonomy as a framework for self assessment is successful and brings larger gains than self assessment without it.

For the second topic the control group recorded an effect size of 0.24. This was a massive decrease compared to the effect size of the first topic. With the cautionary note about the results in mind, I can see three possible reasons for such a drop. The first is that peer assessment was introduced for this second topic, and this may be an indication that for some reason the peer assessment was not having a positive impact on the learning. The second is that the students may have found this topic conceptually more challenging than the first topic. It required them to have a grasp of basic statistics and molecular models, which previous students have found difficult. The third, which may be linked to the second, is that I am a biology specialist and the first topic was the subject of my degree, whereas the second topic is outside of my specialism. Whilst I have been teaching for 13 years, I am more than willing to accept that some lack of knowledge or engagement on my part may account for such a low effect size. Let’s now compare this result with that of the experimental group. Remember, the experimental group were also experiencing peer assessment in the same manner, the same topic, and therefore the same bias in teaching from me. The experimental group recorded an effect size of 0.86, a difference of 0.62.
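The comparison between the two groups is simple arithmetic, sketched below using the four effect sizes reported in this post. The conversion to grades assumes that the “effect size of 1.0 is a two grade leap” rule of thumb scales linearly, which is an assumption made here purely for illustration; the names `grade_gain` and `GRADES_PER_EFFECT_SIZE` are hypothetical.

```python
# Rule of thumb quoted in this post: an effect size of 1.0 is roughly
# two GCSE grades. Scaling it linearly is an illustrative assumption.
GRADES_PER_EFFECT_SIZE = 2.0

# Effect sizes as reported in this post.
results = {
    "B1": {"control": 1.63, "experimental": 2.09},
    "C1": {"control": 0.24, "experimental": 0.86},
}


def grade_gain(control, experimental):
    """Approximate extra grades gained by the experimental group."""
    return (experimental - control) * GRADES_PER_EFFECT_SIZE


for topic, es in results.items():
    difference = es["experimental"] - es["control"]
    gain = grade_gain(es["control"], es["experimental"])
    print(f"{topic}: difference {difference:.2f}, about {gain:.2f} grades")
```

On this rough conversion, both differences come out at around one grade, with the larger gap on the second topic.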

I was very surprised by the results. I expected that SOLO taxonomy would have a positive impact on progress but what I think my results suggest is that SOLO has an even greater impact when the topic presented is more challenging. I suspect it provides the additional structure the students need in order to see what they have already done, what they need to do next and how they are going to get there. Students are even more reliant upon this when they are unsure about the work they are doing. When they are confident in the subject matter, or think they are confident in the subject matter, they are less likely to rely on a structure such as success criteria or maybe not pay as much attention to it.

This marries with the statements students were making in the experimental group. The students started off quite negative about the change. They didn’t see the point in my success criteria and some actively resisted them. As I introduced activities that needed the success criteria to be successful and then showed them how to use the success criteria to revise for their exams, they became more confident with the terminology of SOLO taxonomy and how it could help them to progress. They even began to identify more challenging exam questions based on the SOLO terminology that was being used. One student worried what would happen in September when he got a new teacher who wasn’t going to use SOLO. He said, “But how am I going to know how to improve if they don’t use this system!” He was the biggest doubter at the start of the research.

In summary my results suggest that using SOLO as a framework for peer and self assessment can increase the progress of my students, on average, by 1 grade per year and that is an improvement I cannot ignore.
