
“Success with Style” part 4 — modern data and just a chapter

When starting this analysis I spotted that the download data was for the past 30 days and that this was used for success or fail categorisation. 

Even if the data was for the lifetime of the book, it’s been nearly 5 years since the original downloads. The best way to test this then was to get the latest data (albeit still for the past 30 days).

The other thought was that the analyses looked at the entire book. But what if readers did not read the entire book but only read a certain amount before making a judgment? When submitting work to an agent or publisher for consideration, for example, often only the first chapter is requested. Based on this I analysed just the first 3,000 words of each book through the Penn treebank and LIWC taggers, using the 2013 success/fail classifications, and repeated the experiments.
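
The 3,000-word cut itself is a trivial pre-processing step. A minimal sketch of it in Python (rather than the Perl pipeline used here), with hypothetical ‘books’ and ‘books_3k’ folder names, might be:

    from pathlib import Path

    # Truncate each book's text to its first 3,000 words before tagging.
    # 'books/' and 'books_3k/' are hypothetical folder names for illustration.
    WORD_LIMIT = 3000

    Path("books_3k").mkdir(exist_ok=True)
    for path in Path("books").glob("*.txt"):
        words = path.read_text(encoding="utf-8", errors="ignore").split()
        excerpt = " ".join(words[:WORD_LIMIT])
        Path("books_3k", path.name).write_text(excerpt, encoding="utf-8")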

Finally I noticed a bias towards punctuation as markers for success or failure in the output and ran the experiments without the punctuation tags to see what the result would be.

Starting hypotheses

H0: There's no difference in the tests which produce significant results between the 2014 and 2018 data
HA: There is a difference in the tests which produce significant results between the 2014 and 2018 data

H0: There's no difference in the tests which produce significant results between the full machine analysis of the book and that of just the first 3,000 words
HB: There is a difference in the tests which produce significant results between the full machine analysis of the book and that of just the first 3,000 words

The hypotheses are fairly simple – if there is no difference in the 2018 data, then most of the tests that proved significant with the 2013 data should also prove significant in 2018.

Likewise, if using only the first 3,000 words makes no difference, those tests should be significant at the same levels.

3,000 words (3k words) is about 10 pages and is about one chapter’s length although of course there is no hard and fast rule about how long a chapter is.
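
For illustration only, this is roughly the shape of the significance checks behind the tables that follow: compare one measure between successful and unsuccessful books within a genre and report a p-value. The column names and the choice of Welch’s t-test are my assumptions for the sketch, not a description of the original study’s exact test.

    import pandas as pd
    from scipy import stats

    # Hypothetical input: one row per book with genre, a success flag and a score
    # (e.g. a readability measure or a LIWC category rate).
    df = pd.read_csv("book_scores.csv")

    for genre, group in df.groupby("genre"):
        successes = group.loc[group["success"] == 1, "score"]
        failures = group.loc[group["success"] == 0, "score"]
        t_stat, p_value = stats.ttest_ind(successes, failures, equal_var=False)
        flag = "significant" if p_value < 0.05 else "not significant"
        print(f"{genre}: p = {p_value:.3f} ({flag})")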

Data used

Data summary

2018 data download date: 2018-07-22
2013 data download date: 2013-10-23
Unique books used: 759

Difference in 2013 and 2018 success rates

FAILURE (22 books)
  Adventure             5
  Detective/mystery     3
  Fiction               2
  Historical-fiction    1
  Love-story            1
  Poetry                8
  Short-stories         2
SUCCESS (20 books)
  Adventure             3
  Detective/mystery     4
  Fiction               1
  Historical-fiction    4
  Love-story            3
  Sci-fi                5
Grand total: 42

There were 758 unique books (the remaining 42 of the 800 listed appeared in multiple categories). The 42 books that changed status make up 5.5% of the books used, and none of the books with a different success status was listed in multiple categories.

The new data was parsed through the Perl Lingua tagger (using the Penn treebank tags) and the Perl readability measure, as well as the LIWC tagger.
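
The tagging itself was done with the Perl modules and LIWC, which I won’t reproduce here. As a rough stand-in, the same kind of part-of-speech frequency table can be sketched in Python with NLTK (its tag set differs slightly from the Perl tagger’s):

    from collections import Counter
    import nltk

    # One-off downloads: nltk.download('punkt'); nltk.download('averaged_perceptron_tagger')
    def pos_frequencies(text):
        """Return each part-of-speech tag as a share of all tokens in the text."""
        tags = [tag for _, tag in nltk.pos_tag(nltk.word_tokenize(text))]
        counts = Counter(tags)
        total = sum(counts.values())
        if total == 0:
            return {}
        return {tag: count / total for tag, count in counts.items()}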

Results for 2013, 2018 and 3,000 word data

Machine learning performance

The most important measure for me is which is the best for making predictions. 

Using all tags including punctuation

                  Accuracy   95% Confidence Interval   Sensitivity   Specificity
Readability 2013  65.62%     57.7–72.9%                69%           63%
Readability 2018  65.00%     57.5–72.8%                68%           63%
Readability 3k    55.62%     47.6–63.5%                68%           44%
LIWC 2013         75.00%     67.6–81.5%                76%           74%
LIWC 2018         71.70%     64.0–78.6%                78%           66%
LIWC 3k           56.25%     48.2–64.0%                53%           60%

According to this, LIWC is still the best tagger, and the 2013 and 2018 data give fairly similar results for both readability and LIWC, with each result falling within the other’s 95% confidence interval.

For both readability and LIWC, the first 3,000 words (3k) are much worse predictors of overall success, barely better than a 50/50 guess.
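
The confidence intervals above behave like simple binomial intervals on a held-out test set; their widths are consistent with a test set of roughly 160 books, though that size is my inference rather than something stated in the data. A sketch of the calculation:

    from statsmodels.stats.proportion import proportion_confint

    # Exact (Clopper-Pearson) 95% confidence interval for classifier accuracy.
    # The test-set size of 160 is an assumption inferred from the interval widths.
    n_test = 160
    correct = round(0.75 * n_test)          # e.g. the LIWC 2013 accuracy of 75%
    low, high = proportion_confint(correct, n_test, alpha=0.05, method="beta")
    print(f"{correct / n_test:.2%} accuracy, 95% CI {low:.1%} to {high:.1%}")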

Difference in significance in key measures

Punctuation

Overall, omitting punctuation made little difference to the LIWC or Penn analyses, although the machine learning performance dropped by around 5 percentage points in each case.

Readability 

Genre                Significant 2013   Significant 2018   Significant 3k words
Adventure            TRUE               TRUE               TRUE
Detective/mystery    TRUE               TRUE               TRUE
Fiction              FALSE              FALSE              FALSE
Historical-fiction   FALSE              FALSE              FALSE
Love-story           TRUE               TRUE               TRUE
Poetry               FALSE              FALSE              FALSE
Sci-fi               FALSE              FALSE              FALSE
Short-stories        FALSE              FALSE              FALSE

Readability was significant in the same genres across all three datasets.

LIWC categories

Test                      Genre                Significant 2013   Significant 2018   Significant 3k words
Clout                     Adventure            TRUE               FALSE              TRUE
Clout                     Detective-mystery    TRUE               TRUE               FALSE
Clout                     Fiction              TRUE               TRUE               FALSE
Clout                     Historical-fiction   FALSE              FALSE              FALSE
Clout                     Love-story           FALSE              FALSE              FALSE
Clout                     Poetry               FALSE              FALSE              FALSE
Clout                     Sci-fi               FALSE              FALSE              FALSE
Clout                     Short-stories        FALSE              FALSE              FALSE
Authenticity              Adventure            FALSE              FALSE              FALSE
Authenticity              Detective-mystery    FALSE              FALSE              FALSE
Authenticity              Fiction              TRUE               TRUE               FALSE
Authenticity              Historical-fiction   FALSE              FALSE              TRUE
Authenticity              Love-story           FALSE              FALSE              FALSE
Authenticity              Poetry               TRUE               TRUE               FALSE
Authenticity              Sci-fi               FALSE              FALSE              FALSE
Authenticity              Short-stories        FALSE              FALSE              FALSE
Analytical                Adventure            FALSE              FALSE              FALSE
Analytical                Detective-mystery    FALSE              FALSE              FALSE
Analytical                Fiction              TRUE               TRUE               TRUE
Analytical                Historical-fiction   FALSE              FALSE              FALSE
Analytical                Love-story           FALSE              FALSE              TRUE
Analytical                Poetry               FALSE              FALSE              FALSE
Analytical                Sci-fi               FALSE              FALSE              FALSE
Analytical                Short-stories        FALSE              FALSE              FALSE
6 letter words            Adventure            TRUE               TRUE               TRUE
6 letter words            Detective-mystery    FALSE              FALSE              FALSE
6 letter words            Fiction              FALSE              FALSE              FALSE
6 letter words            Historical-fiction   FALSE              FALSE              FALSE
6 letter words            Love-story           TRUE               TRUE               TRUE
6 letter words            Poetry               FALSE              FALSE              FALSE
6 letter words            Sci-fi               FALSE              FALSE              FALSE
6 letter words            Short-stories        FALSE              FALSE              FALSE
Dictionary words          Adventure            FALSE              FALSE              FALSE
Dictionary words          Detective-mystery    FALSE              TRUE               TRUE
Dictionary words          Fiction              TRUE               TRUE               FALSE
Dictionary words          Historical-fiction   FALSE              FALSE              TRUE
Dictionary words          Love-story           FALSE              FALSE              TRUE
Dictionary words          Poetry               FALSE              FALSE              FALSE
Dictionary words          Sci-fi               TRUE               TRUE               TRUE
Dictionary words          Short-stories        FALSE              FALSE              FALSE
Tone                      Adventure            FALSE              FALSE              FALSE
Tone                      Detective-mystery    TRUE               TRUE               TRUE
Tone                      Fiction              TRUE               TRUE               TRUE
Tone                      Historical-fiction   FALSE              FALSE              FALSE
Tone                      Love-story           TRUE               TRUE               FALSE
Tone                      Poetry               TRUE               TRUE               TRUE
Tone                      Sci-fi               FALSE              FALSE              FALSE
Tone                      Short-stories        TRUE               TRUE               TRUE
Mean words per sentence   Adventure            TRUE               TRUE               TRUE
Mean words per sentence   Detective-mystery    FALSE              FALSE              FALSE
Mean words per sentence   Fiction              TRUE               TRUE               FALSE
Mean words per sentence   Historical-fiction   FALSE              FALSE              FALSE
Mean words per sentence   Love-story           FALSE              FALSE              FALSE
Mean words per sentence   Poetry               FALSE              FALSE              FALSE
Mean words per sentence   Sci-fi               FALSE              FALSE              FALSE
Mean words per sentence   Short-stories        FALSE              FALSE              TRUE

Whereas readability was consistent across the different approaches, the LIWC categories show a lot more variety.

Tone has the most success across these. As before, the 2013 and 2018 data tend to match (though not always, as with Clout or Dictionary words), while the 3,000-word results tend to do their own thing. Tone was the most consistent throughout and, as last time, was significant in the most genres, even with only 3k words.

Parts of speech tags (PoS) with the largest difference

The tables list the top 3 PoS that dominate in successful and unsuccessful books.

Penn data

Successful PoS 2013              Successful PoS 2018              Successful PoS 3k
IN – Preposition / Conjunction   IN – Preposition / Conjunction   IN – Preposition / Conjunction
DET – Determiner                 DET – Determiner                 DET – Determiner
NNS – Noun, plural               NNS – Noun, plural               NNS – Noun, plural

Unsuccessful PoS 2013                 Unsuccessful PoS 2018                 Unsuccessful PoS 3k
PRP – Determiner, possessive second   PRP – Determiner, possessive second   RB – Adverb
RB – Adverb                           VB – Verb, infinitive                 PRP – Determiner, possessive second
VB – Verb, infinitive                 RB – Adverb                           VB – Verb, infinitive

LIWC data

Successful PoS 2013                 Successful PoS 2018             Successful PoS 3k
functional – Total function words   functional – Functional words   functional – Total function words
prep – Prepositions                 prep – Prepositions             prep – Prepositions
article – Articles                  space – Space                   article – Articles

Unsuccessful PoS 2013          Unsuccessful PoS 2018          Unsuccessful PoS 3k
quote – Quotation marks        allpunc – All punctuation      adj – Common adjectives
allpunc – All punctuation      affect – Affective processes   adverb – Common adverbs
affect – Affective processes   posemo – Positive emotion      affect – Affective processes

The same tags dominate successful books across all the Penn treebank analyses – prepositions and conjunctions (for, of, although, that), determiners (this, each, some) and plural nouns (women, books).

For unsuccessful books, determiners also dominate but in the possessive second person (mine, yours), along with adverbs (often, not, very, here) and infinitive verbs (take, live).

For LIWC it is quite similar. Successful books are dominated by function words (it, to, no, very), prepositions (to, with, above) and articles (a, an, the).

For unsuccessful books it is all punctuation, quotation marks, social words (mate, talk, they – a category that includes all family references) and affective processes (happy, cried), which cover all emotional terms.

A high rate of quotation marks suggests a high ratio of dialogue to action/description.

What does this tell us?

2013 v 2018 data

Overall there is more similarity than difference in the 2013 and 2018 Penn and readability results. The machine learning performance was also broadly the same, with each other’s overall performance falling within the 95% confidence interval.  

The most successful PoS were also largely the same, as were the top 3 unsuccessful ones.

Likewise the LIWC categories generally matched in significance for both 2013 and 2018 data. The Successful PoS were broadly the same, as were the unsuccessful ones.

This suggests that while the original authors didn’t mention that the download data covered only the previous 30 days, their results have largely held up.

The first chapter

Just judging a book by its first 3,000 words was not as accurate as analysing the whole book. The machine learning performance was barely better than a guess. 

However, the readability results did match, and the dominant successful PoS were similar to those of the full data in the 2013 and 2018 analyses.

Of all the LIWC categories described in part 3, Tone was both the most significant predictor across genres and the most consistent across the different tests.

Summary

The 2018 results generally match the 2013 results, which suggests the original method still holds as a good predictor of success or failure for these books.

The first 3,000 words’ results did not match the 2013 or 2018 data, and as their machine learning performance was the weakest, this suggests that judging only the opening is not an accurate way to predict a book’s success. It may be that there is a ‘sweet spot’ where the first x amount of words correlates closely with the overall rating, but it is more than 3,000 words.

Successful books tend to use prepositions, determiners, nouns and function words. Unsuccessful ones skew towards quotation marks, punctuation and positive emotions (which in the LIWC are similar to affective processes).

This suggests that unsuccessful books may use shorter sentences (high punctuation rate), more dialogue (high quotation mark rate), adverbs and are more emotional, particularly positive emotions. Writers are frequently told by writing experts to avoid adverbs wherever possible.

Successful books by contrast tend to focus on the action – describing scenes and situations, hence the dominance of functional words, prepositions and articles. This makes them sound rather boring, but suggests that these bread and butter words are necessary to build a good story.

The LIWC data suggests that tone is the most reliable predictor of success. What isn’t answered is whether that is because tone predominates in successful or unsuccessful books, or whether it is positive or negative emotion that matters. That is something to explore, though affect and positive emotion appearing in the top 3 for unsuccessful books suggests where to look.

Punctuation tags had some use: machine learning performance was better with them, so even though they can be hard to interpret, they are worth including in any machine analysis. More work is needed to interpret them.


The 5Ws and whodunnits: a Chinatown character exercise

A whodunnit is a genre of film where, as the name suggests, we want to find out who did it. But just knowing who is not enough; we also want to know the why, where, when, what and how.

In journalism these points are generally known as the 5Ws (even though ‘how’ makes it 5W1H) and are seen as the fundamental information a news story needs to convey. This information should appear in a story as early as possible.

5Ws and whodunnits

The whos, whats, wheres, whens, whys and hows

“I keep six honest serving-men, (They taught me all I knew);
Their names are What and Why and When, And How and Where and Who.” – Rudyard Kipling

Naturally in a whodunnit we don’t want to know the answers too early or it spoils the fun, with the honourable exception of Columbo, which made a point of showing all this at the start. Even then it became a mystery of how the rumpled detective would uncover the information.

What makes the television show Columbo, with its one big crime to solve, different to a mystery film such as Chinatown is that while both involve an investigation, in movies the general principle is that there are multiple mysteries. Act I typically has a smaller, more pedestrian mystery that leads into the bigger one, with Act III sometimes having its own mystery resolved, often one that stemmed from the Act I mystery but was not directly investigated at first.

As an exercise I did some work on looking at the mysteries within the classic 1974 film Chinatown written by Robert Towne and looked at the 5Ws for each of them. Spoilers follow, naturally.

Mysteries in Chinatown

For each mystery I have not only labelled the initial ‘answer’ to each of the 6Ws, I’ve also iterated on why each answer is what it is. Most chains end with a character motivation, or with it being part of the setting.

The mysteries are listed in no particular order, and the act marks are where it’s first raised but not necessarily where it’s solved.

Mystery 1/Act I mystery – Is Mulwray having an affair? (The false mystery)

Is Mulwray (Evelyn’s husband) having an affair (False mystery)?

Answer/Why 1 → Why 2 → Why 3 → Why 4

Who?   Mulwray, head of the Department of Water and Power → He’s a rich guy → Gittes, the private investigator, assumes pretty girls are mistresses → [It’s in his character to assume powerful men have affairs; he’s seen it before]
What?  Attractive young girl → She’s pretty → Gittes assumes pretty girls are mistresses → [It’s in his character]
Where? In a house Mulwray owns in Echo Park, LA → Away from the Mulwrays’ posh home → Gittes has seen this before → [It’s in his character]
When?  During work hours → So his wife doesn’t know → Gittes has seen this before → [Backstory]
Why?   Who cares → Jake doesn’t care about the whys in this case → He’s in it for the money, he’s a professional
How?   He visits the love nest → Away from the wife → Gittes knew where to look as he has seen this before

Mystery 2/Act II – Who is stealing LA’s water?

Who is stealing the water?

Answer/Why 1 → Why 2 → Why 3 → Why 4 → Why 5

Who?   Noah Cross, a rich industrialist → He owns land that needs water → It’s desert and worthless without water → He bought it on the cheap knowing he could get water → [It’s his character to get what he wants regardless of the ethics]
What?  Water is being diverted during a drought → Noah is a powerful man and can do this → He is extremely wealthy → [It’s his character]
Where? To land Noah Cross, a rich industrialist, has bought → He wants it watered → To increase its value → It’s desert and worthless without water → LA is a desert [It’s the setting]
When?  At night → So no one will see → As it’s theft → Water is precious in desert LA → Noah is rich but not so rich he can do whatever he wants [Conflict with being used to getting what he wants]
Why?   To store water in land to make it more valuable → The land is dry → Noah wants to do it secretly → So that only he will benefit → He’s a greedy man [It’s his character]
How?   Diverting water through channels → They were built there → [It’s the setting]

Mystery 3/Act IIb – Who set Gittes up?

Who set Gittes up?

Answer/Why 1 → Why 2 → Why 3 → Why 4 → Why 5

Who?   Noah Cross → Mulwray was blocking his plans to build a dam → The water would go to desert land and not benefit the citizens of LA → Mulwray believes water belongs to the people and can’t be corrupted → [It’s in his character]
What?  Hired an actress to hire Jake Gittes to investigate ‘her’ husband → Gittes would believe her as the real Mrs Mulwray (Evelyn) wouldn’t do it → Mrs Mulwray loved her husband → [It’s in her character to love this father figure]
Where? At Gittes’ office → It’s a city built in the desert → Water is a precious commodity here → [It’s the setting]
When?  While Mulwray was seeing Evelyn and was head of the Department of Water and Power → Cross knew what it would look like → Mulwray could not reveal the truth of who the girl is; she is tainted → She is a product of incest → He won’t hurt her [It’s in his character to be good]
Why?   Cross wanted to blackmail Mulwray → So the dam will get built → The worthless land he bought will be worth millions → He wants a legacy → [It’s in his character to desire his name living on after him]
How?   By having Gittes take photos of Mulwray with a girl and a ‘love nest’ → Gittes is a well known PI so his evidence is credible → He will do what it takes to get the evidence → [It’s in his character]

Mystery 4/Act I and Act II – Who is Mulwray’s mistress?

Who is Mulwray’s mistress?

Answer/Why 1 → Why 2 → Why 3 → Why 4

Who?   Katherine Cross → His wife’s daughter/sister → Incest in the past when Noah’s wife died → Noah can do what he wants [It’s in his character]
What?  Mulwray’s step-daughter (and sister-in-law) → He wants to protect and look after her → He’s a good guy → [It’s in his character to be good]
Where? In a house Mulwray owns in Echo Park, LA → He wants to keep her away from Evelyn → He doesn’t feel Evelyn should have contact → Because the daughter is a product of incest
When?  Since he married Evelyn → She’s her daughter → He feels responsible → [It’s in his character to do the right thing]
Why?   He wants to raise her well and keep Evelyn out of it → He wants Katherine to have a normal life → He doesn’t want her to know her past → It’s disturbing to know you’re a product of incest and he wants to protect her from it [It’s in his character]
How?   He visits when he can and takes her places → He wants her to have a normal enough life → He puts value on people living well → [It’s in his character]

Mystery 5/Act II – Who killed Mulwray?

Who killed Mulwray?

Answer/Why 1 → Why 2 → Why 3 → Why 4 → Why 5 → Why 6

Who?   Noah Cross (Evelyn’s father) → They were meeting to discuss the future → Mulwray has power to stop Cross’ plans → Mulwray is head of DWP → Mulwray sold his and Cross’ private water company to LA → Mulwray believes that water belongs to the people → [It’s in his character – he believes in ‘the people’]
What?  Drowned Mulwray → Heat of the moment → He became angry at Mulwray’s refusal → Without water the land is worthless so his investment would be a waste → Noah gets furious if he doesn’t get his way → [It’s in his character to react angrily to refusals to his will]
Where? In his own pond of saltwater → They were meeting at Mulwray’s house → Cross wanted somewhere private to discuss his plans → His plans are dodgy and involve defrauding LA voters → Cross is prepared to make dodgy deals to get his way → [It’s in his character]
When?  After Mulwray said he’d reveal his plans to buy land on the cheap and divert water there → There’s a referendum soon on whether to build the dam → Mulwray publicly opposes it as head of the DWP → As head of the DWP and builder of another dam his voice carries weight → [It’s in his character to do the right thing]
Why?   To allow a dam to be built → It will provide water to a dry valley → Cross has been buying land on the cheap in the dry valley → He wants money and a legacy → [It’s in his character]
How?   Cross pushed him in → Heat of the moment; Cross did not plan to kill Mulwray → Cross has a short temper → [It’s in his character]

Analysis of the analysis

When I started this exercise I didn’t plan to end each chain of whys on a particular kind of answer. But with each mystery I broke down, the chain seemed to flow naturally back to character or setting.

What is also satisfying is the number of iterations Chinatown offers for each mystery and its consistent character motivations. There wasn’t a straightforward answer to any of them – each took multiple iterations – and this fits with master of mystery Raymond Chandler’s ‘Ten Commandments for the Detective Novel’: that we are honest with the reader (or viewer) and have given them, through the 6Ws, the information needed to make the answer seem inevitable once revealed.

Partly this may be because I am over-analysing and making it seem more complicated than it is. But I’ve sat on this analysis for a while, and having returned to it there is something satisfying about this approach.

The false mystery found

As stated earlier, Act I generally has the false mystery, one that segues into the larger mystery. It is also not what the film is known for.

Compare this with the Chinatown-inspired Who Framed Roger Rabbit. Private investigator Eddie Valiant’s Act I ‘puzzle’ is similar in that it’s notionally about him finding out about an affair, but in reality this is a staged setup for something bigger.

Neither protagonist is asked to investigate the film’s larger mystery – who killed an important man, and the land-grab conspiracy behind it – that emerges in Act II, but each chooses to do so. Along the way they also stumble inadvertently into answers to deeply personal puzzles they didn’t even want to know – that Evelyn is mother to her own sister, that Roger Rabbit‘s Judge Doom is a Toon.

Applying the 6Ws to characters and plot

I’m not pretending that this tool can be used to plot mysteries. But it can be used to sense-check what you have written: first, that you answer all the 6Ws that a sharp viewer will want to know (apart from JJ Abrams and his incomprehensible passion for Mystery Boxes); second, that your answers have some depth beyond “it just is”.

As mentioned, there is a danger of over analysis – if you’re smart enough you can spin out anything. But with honest evaluation it may help as a tool to look for depth of mystery and consistency of character across the story.

It may be that this applies beyond mysteries, thrillers, whodunnits and the like but it seems an obvious start. In theory any protagonist and antagonist’s motives can be analysed this way too, and may be a way to check that a villain’s goals really are beyond ‘because he is evil’.

That’s for another analysis.


A better way of writing? Part 1: current problems

There is a problem with creative writing, and it’s an old one. It takes a lot of talent and energy to write a novel or screenplay, yet only a very few individuals have that combination of great story, great writing and a little luck to see it published or produced to widespread acclaim.

There are many reasons why so few people make a career as a writer. It takes time and commitment to a gripping idea and then the skill to write it in an engaging way. Even writing a great book or script is no guarantee a publisher, agent or studio will pick it up, due to market forces, personality clashes, bad luck or events.

Most writers write alone and review alone, beyond a small group of friends and family, and most fail alone. But what if they could have help from others of equal talent to make their good story a great one?


Writing quality and quantity

Very few people like to criticise others (although a few people do make a living from this kind of writing). Other than usually effervescent friends and family, the unpublished writer will get their criticism from writing groups.

I’ve been to multiple writing groups and the standard has been pretty good but not great (my own work included). That’s not to say there aren’t many clever people with very good work, but it hasn’t been of the quality needed to sell.

Writing groups can help with feedback to improve quality, and a good writing group will offer more feedback than “I really liked this line” and zero in on problems with the dialogue, pace, story and writing. Unfortunately, while the group is generally good at spotting the problem it’s bad at offering the right solution.

Writers may not take on board much feedback, dismissing it as the comments of amateurs. To act on feedback you generally need to hear it from others you regard as your peers or superiors.

Writers’ rooms

I recently went on a training session where we split into teams to complete tasks. Our team ‘won’ in that we completed the most tasks, but the secret was that we should have worked with the other teams to complete all the tasks, because in the exercise we shared a ‘boss’. In writing our boss is the reader, and as writers many of us are working alone to complete a pretty good story rather than coming together for a completely satisfying one.

There are exceptions. In the US writers’ rooms, groups of writers working on a screenplay, are common for TV shows such as The Simpsons, Narcos and other top programmes. A writers’ room can lead to consistency over a series, draws together ideas, improves standards and enables a script to make it to read through and production fairly quickly.

So why aren’t they more common beyond US TV shows, why not in film or for novels, and why are they rare in the UK? Doctors on Radio 4 and Doctor Who are pretty much the only British writers’ rooms (the BBC Writers Room is the corporation’s submission site).

One reason is that UK TV series are shorter, typically 6 episodes, so one or two people can write it. US shows can be a run of 20 episodes in a season. Yet there are other places in the UK where writing is produced by a team to tight deadlines.

Just not in creative writing.

Content teams and writers’ rooms

I’ve written for and led content teams producing content for GOV.UK. Unlike a writing team, the content team’s output isn’t creative, but the process to get there is: taking bureaucratic, legalistic documents and translating them into language an audience not just understands but needs to understand (say, to get a passport or pay a fine) requires a lot of creative thinking.

So what makes a ‘content team’ different to a “writers’ room”? The secret sauce here is that a content team is multidisciplinary and Agile.

Using Agile delivery – breaking down projects into tasks assigned to individuals and agreed by a team as a whole to be delivered in a set time – and writing to a clear style (both for English and approach to work) focuses content designers.

Why content designers rather than writers? One reason is that writers were seen as rather servile – black boxes where someone sent a document to have its spelling and worst sentences corrected. Content designers do that too, but have more power to shape the content: its structure and its language (particularly that it meets the style guide) – and they can even reject the proposal.

Good writers do this too, but content designers probably work in a system more akin to that for software delivery rather than creative writing – in my experience in the media and publishing writing was not delivered this Agile way.

But is this process suitable for creative writing and can it help this push to greatness, or is the problem too great to solve with just one tool?

The writing problem – and a solution?

The problem with writing then is that writers aren’t being pushed enough. Some are of the calibre to push themselves, but for the majority the discipline and effort needed for the final push from good into great is too much.

Next time: how Agile methods can be used to achieve this push.


Scrivener: the best tool for organising user research

User research involves a lot of, well, research: a lot of notes, documents, videos, pictures, post-its and more. And they all need organising.

There’s no one solution for the problem of what to do with all this, but after a bit of experimentation I find that using Scrivener has been the best for me for keeping things organised.

Scrivener is often seen as a writing tool, but it’s more than a word processor. Yes, it is a writing tool – from word processing to screenplays – but it is also an organiser. Most importantly, it’s very simple to use, and has more advanced features for those who want them.

[Image: Scrivener being used for user research – folders and multiple documents displayed at once]

[Image: Renaming research in Scrivener]

I’ve been using Scrivener for years, and coming to user research from an anthropological and journalistic background I focus on research that’s written up – observations, interviews, transcripts. But I also add photos, plan card sorts, organise thoughts with the card index display, and add spreadsheets, PDFs and presentations. Even if I don’t read the presentations directly in there, being able to search all relevant work in one search helps.

In Scrivener I like how easy it is to organise and rename documents, or duplicate them. Compared with doing this in Finder or Explorer, it is much less of a faff. Likewise documents open immediately rather than take a few seconds in Word or Google Drive (and often aren’t the one I want anyway).

While I still use Google Drive and Dropbox to organise files, particularly video, the amount of research that is pure words – transcripts, proposals, documents and insights – means I find Scrivener is the best way to keep it all together.

Tables

I love tables. I like maths, I like spreadsheets. Really.

I like to organise interview questions in tables and use a Dewey-esque numbering system to help reorganise them. So question 101 is the first, but perhaps it needs to come later, so I reorganise it as 103 and sort.

Likewise when reviewing a transcript I like to have each question in its own cell with thoughts and insights in the cell next to it.

Scrivener could be friendlier with tables – don’t create one at the end of a page or you’ll never get out, and I always have to customise it. But once I created a good, blank table I could copy and paste that.

Sort code | Quote | Observation
101 | I’m not really sure that it’s appropriate | User not keen on this
102 | Do I really have to give you a dummy quote? | Prefers to be in control of speech
250 | At this time, a friend shall lose his friend’s hammer and the young shall not know where lieth the things possessed by their fathers | Likes Brian?

Good things about using Scrivener for user research

What’s great:

  • Easy to move documents around and organise into folders and rename them
  • Split view makes reviewing transcripts and images easy
  • Colour and icon coding makes it easy to find key files
  • Compiling documents means you can make the output consistent, or just select the ones you need to put into a single PDF or Word report, or output as multiple documents so you don’t have to worry about formatting until the end
  • Coding for things such as image captions means that you don’t have problems with Word getting confused about auto-numbers
  • Text file syncing – if out in the field you can create text notes and sync them automatically into the project 
  • Great search tool for searching titles or entire files
  • Corkboard views to organise thoughts, observations, insights etc
  • Good way to have a list of priorities and hierarchies
  • Importing documents automatically works pretty well, just drag and drop the Word docs to where you want them and it’ll convert them into a continuous webpage rather than multi page report

What’s not so great:

  • No dictation tool
  • Not always the best way to view documents and tables
  • No Android version, although there is one for iOS – though it’s rare that you need the entire project on your phone
  • Adding weblinks – Scrivener pre-fills the https:// part, but URLs copied from Chrome already include it, so you get ‘broken’ https://https:// links if you forget to remove one
  • Can be fiddly with bullets

User research tools to support Scrivener

OneNote, which isn’t free, is good for:

  • Transcripts – jump to the audio where your notes are, as it tracks your writing against the recording (although only 15 minutes of recording on Android for some unknown reason). It can convert speech to text, though I find that a bit less reliable.
  • Optical character recognition – it’s not 100% accurate but it’s good enough for recognising text from images, and these will show up in search
  • Syncs across devices

I also use Trello to track research questions, answers and insights.

Overall Scrivener with its files synced through the cloud (Dropbox, OneDrive etc) has been great for keeping track of research. Scrivener isn’t free, but I feel I got my $45 worth of use long ago, and it’s less than what Microsoft charges for Office 365 (which includes OneNote).

Scrivener hasn’t sponsored or otherwise provided incentives for me to write this (nor has Microsoft, though I’d feel weird if they did), I just want to spread the word for a useful tool.


The user researcher and the screenwriter

Screenwriting is getting a story onto paper which is then made into a film. User research is understanding how users behave when trying to complete a task or service, typically online. 

Putting it like that there may not seem to be too much similarity between the two, but explore further and I believe that they share the same goal – of documenting the human condition and producing a ‘truth’ within parameters.

Are there differences? Of course, but it’s the similarities that I find interesting.

[Image: Time to interview – Ethan via Flickr]

A history of two crafts

Screenwriting has been a craft for over a century; user research, in its current form, has only been embraced by governments on a wide scale over the past few years. Being so new and flexible, it gives me some room to manoeuvre, but in general the similarities are:

  • a quest for a truth, in defined parameters
  • a following of principles over rules
  • the aim of recording how people actually speak over how we think they should
  • show over tell

But what of market research? Well, it’s similar to but different from user research. Market research is about finding out about users but takes a more analytical approach, focusing on breadth often at the expense of depth. Government user research isn’t concerned so much with segmentation, weightings and the like (though they are not ignored). It’s about reaching the goal.

Screenwriting is similar — there are no rules, or if there are, there are too many exceptions. All that matters is writing a story that works.

A quest for truth

In very general terms, films aren’t necessarily about a truth – Superman has not saved the planet, the Inglourious Basterds didn’t kill the Third Reich’s ringleaders, Withnail never existed let alone acted.

But within their own universe, that created for the film, they must stay true to the rules created if they are to succeed. Superman can do almost anything, but even he must stay true to his rules — he will still ‘do good’ whatever is thrown at him.

The Inglourious Basterds burned Hitler and his cronies because to director Quentin Tarantino, that’s what worked in his story that included a glorified ‘kill the enemy film’, albeit from the German’s perspective.

Withnail may not have existed, but the relationships, tensions and ambitions Withnail & I explored are true enough in our world because it is set in the same rules as our universe.

And so on… So what am I getting at? You define the goals, you set parameters, and you stick to them if you want to succeed.

Individual approach

‘”There are no rules to follow, Donald, and
anybody who says there are, is just –”
“Not rules, principles.”‘ — The Kaufman twins, Adaptation.

Many crafts have principles rather than rules. But user research is still fairly new in government and to a large extent it is still down to the individual or small team carrying out the user research to get to the goal. As such it is still down to the individual who does it.

This is reflected in that very few user researchers I’ve worked with have specialised in this for their careers. Instead they’ve come from a variety of backgrounds, and for myself it’s been content, journalism and anthropology.

It’s down to the individual.

Show, not tell

It’s rare for a film to succeed where all the characters do is tell you how good they are. In fact the audience largely forgives what we’re told about them if we see them doing wonderful things (ask Indiana Jones just how old Marion Ravenwood was when he seduced her).

User research is about showing what is found — at show and tells, in videos of interviews, of producing quotes and examples. Don’t just tell us what is found, show it, and be consistent.

Getting to the heart of people

Ultimately screenwriting and user research have one key goal – to show us that this is what life is really like. It is to produce something that is recognisable.

User research is like that. Taking something and passing it on to the next stage of the process. Looking for recognition that yes, this is what reality is and what we need to produce. You write down what people say, not what you think they say, and arrange it to make sense.

You also look for plot holes and inconsistencies and how to get rid of them, whether that’s more user research or revising the screenwriting.

A team sport

Finally it’s about others. User researchers don’t work alone; you’re encouraged to show, not tell, key parts of your process. You do not work in isolation – your work forms the foundation of all that follows.

No script, no film; no user research, no project.

Ultimately it’s about getting to a truth. Not the truth, but a truth within the parameters set out. And one that at the end – whether it is the final show-and-tell, the handover or the finished film – will leave the audience satisfied that they saw something true to what was set out.


Scraping, screenplays and sexism

In the past couple of days there have been two big data posts that analyse sex and screenplays.

Polygraph’s Hannah Anderson and Matt Daniels scraped and analysed 2,000 screenplays and their dialogue to get data on the division of dialogue according to sex, age and other factors.

The Economist looked at data from USC Annenberg on nudity and ‘sexualised attire’ (aka revealing outfits and the like) in film, along with lead and speaking roles by sex.


Getting screenplay data

Both reports focused on presenting the data and key thoughts rather than delving too deep into interpretation. Analysing Hollywood is a complex business – like William Goldman said “nobody knows anything” when it comes to predicting success, let alone Hollywood and sexism.

The main thing of interest for me is the methods of analysing screenplays. Matt has a long and detailed method with links to script sources, along with the code on Github and a list of where he got the data from.

Potential uses

Both studies used data to explore issues around gender and films, but there is further potential with the data. For example:

  • emotion and sentiment – not a fan due to the drawbacks but possible to trace emotion in scripts, looking at such things as whether beginning, middle or ends are more or less emotional and is there a pattern
  • the split of action and dialogue in a script – do successful scripts have a particular divide (aka an avoidance of walls of text)? A rough way to estimate the split is sketched after this list
  • are women more confident or not – an extension of their sexism report, but it could be a question of whether female characters tend to ask more questions (or use more emotional language)
  • writing level – what is the typical readability for the dialogue of heroes and villains, along with scripts in general and how does this vary by genre (would The Imitation Game or A Beautiful Mind be more difficult to read, let alone film, than Die Hard?)
  • is good writing important in a successful script – as with the study of readability, does having too many adverbs and other things that Hemingway hates hinder scripts
  • statistical significance – as Matt acknowledges, there are no statistical tests in their report, what tests could be done
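
On the action/dialogue split, a very crude heuristic is possible even without Polygraph’s parsing: treat an all-caps line as a character cue and the lines that follow it (until a blank line) as dialogue. This is a sketch of that assumption, not how either study processed its scripts:

    def dialogue_action_split(script_text):
        """Roughly count words of dialogue vs action in a plain-text screenplay."""
        dialogue_words = action_words = 0
        in_dialogue = False
        for line in script_text.splitlines():
            stripped = line.strip()
            if not stripped:
                in_dialogue = False          # a blank line ends a speech
                continue
            if stripped.isupper() and len(stripped.split()) <= 4:
                in_dialogue = True           # character cue, e.g. "JAKE GITTES"
                continue
            if in_dialogue:
                dialogue_words += len(stripped.split())
            else:
                action_words += len(stripped.split())
        return dialogue_words, action_words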

Why we need this data

Maybe nothing will come out of it, but there is no harm in trying, and while I never expect any rules to emerge (Goldman is already laughing), perhaps some very broad principles could emerge from the data. Even a finding of nothing can be something to report. The only pity is that, due to the grey areas of scraping, we’d have to start from scratch rather than use the script data the teams have already gathered.

But it will be worth it and we can get away from what the Polygraph article calls “all rhetoric and no data, which gets us nowhere in terms of having an informed discussion.”

In the meantime if you want to search the data you can either check out the links or use the Polygraph tool here.


Reviews of reviews, reviewed

I’m very pleased to have passed my data analysis and statistical inference course. It’s just a shame that it reveals that a previous post is wrong.

First, why I’ve written this. It’s about statistical inference. To give a very short, very simplified view, statistical inference is a way of making predictions about larger sets of data from a few samples.

Say we wanted to work out the probability of a drug test giving a false positive, or the proportion of university-educated men who think a woman’s place is solely in the home. You could test or ask everyone, or you could run it past a few and predict – or infer – the wider picture.
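
As a toy example of that kind of inference (the numbers are invented purely for illustration): if 130 of 200 sampled men agreed with the statement, the wider proportion can be estimated with a confidence interval rather than asking everyone.

    from statsmodels.stats.proportion import proportion_confint

    # Infer a population proportion from a sample (made-up numbers).
    agreed, sampled = 130, 200
    low, high = proportion_confint(agreed, sampled, alpha=0.05)  # normal approximation
    print(f"Point estimate {agreed / sampled:.0%}, 95% CI {low:.0%} to {high:.0%}")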

That’s what I tried to do with a script analysis. Unfortunately in my enthusiasm for the course, I did something very wrong – I went with what seemed to work rather than what I could prove.

I wanted to analyse scripts to make an inference. Excel has statistical tools. It seemed like a simple case of feeding the data in and getting an answer out. Except only one tool gave a good answer. But it was called a ‘correlation test’, so surely it would show how things matched, or ‘correlated’?

Not really. Now that I’ve worked with R, a maths program better suited to these tests than Excel, and passed my course with distinction (I earned it, I’m going to brag), I know what I should have done.

I used too small a sample and my tests were too arbitrary. I could have got away with a small sample, or even an arbitrary test, but I combined too many poor techniques. So I won’t repeat the experiment.

But it’s a good start. More importantly it’s shown me what can be analysed and that’s worth it, I’ll be starting that analysis over Christmas.

And if you do want to learn for yourself I can only recommend the Duke University online course via Coursera. You can sign up yourself for the next data analysis and statistical inference course.


Season finales: which shows went out in style?

Season finales, the last show in a series, the end of an era… when a TV programme comes to an end (or season, depending on where you are) there’s a high expectation the writers will make it a classic.

This isn’t always the case. The Sopranos became notorious for its unclear ending of whether the main character, Tony Soprano, died or not. On the other hand, Breaking Bad‘s ending, which resolved the fate of Walter White, Jesse Pinkman and the others, won rave reviews.

Best season endings

The reason I used those two examples is that both The Sopranos and Breaking Bad were generally and consistently well reviewed, so the endings carried a high expectation of being of equally good (and ideally better) quality. Yet how do they compare to other series?

Two Reddit users, PhJulien and ChallengeResponse, have done something clever I wish I’d thought of — getting the data from IMDb and comparing finales with average ratings. IMDb not only lists every episode but also collects user ratings. More importantly, it lets you get at its data.

Here’s what they found.

Finales that topped the series

[Chart: Series finales that topped or bombed – via /u/ChallengeResponse/Imgur]

What ChallengeResponse did was write a Python program to get the data and then make a chart ranking shows by the difference between the average rating and the finale rating. He’s ranked this by the biggest difference, so Glee, the show about a school where the singing never stops, which got around a 6.8 average, had a finale rated 9.2. I’m reading the charts for these numbers so may be off, but that’s a difference of 2.4 rating points.

At the other end, Dexter, the show about the serial-killer killer, caused a stink with viewers, dropping from its average of 8.9 out of 10 to 4.8 in the finale, a drop of 4.1 rating points.
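
The calculation itself is simple once you have per-episode ratings (however obtained, for example via IMDbPY): take the finale rating minus the average of the earlier episodes and rank by the difference. A sketch with made-up numbers, excluding the finale from the average to avoid the bias mentioned further down:

    # Per-episode ratings per show; the numbers here are placeholders, not real data.
    ratings = {
        "Show A": [8.1, 8.3, 7.9, 8.4, 9.6],
        "Show B": [8.8, 9.0, 8.9, 9.1, 5.0],
    }

    deltas = {}
    for show, episodes in ratings.items():
        earlier = episodes[:-1]                      # everything except the finale
        deltas[show] = episodes[-1] - sum(earlier) / len(earlier)

    for show, delta in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{show}: finale is {delta:+.1f} points against the series average")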

Another, earlier, way to look at this is through PhJulien’s chart, which scatters average rating to finale.

[Chart: Series average ratings plotted against finale rating – via /u/PhJulien/Imgur]

Looking at it this way, Breaking Bad, which had an extremely good average of 9.0 for the series as a whole, went out with a 9.9. So a good show went out almost perfectly, according to the public rating it on IMDb.

Seen like this, the majority of shows go out a little better than their average (which is what viewers want).

Would this work with British TV?

No British show is in PhJulien’s chart, and only one in ChallengeResponse’s data – The Office (its US version is in PhJulien’s).

Could I repeat this? Yes, but the difference is that US shows offer a much bigger sample size — the US version of The Office ran to 201 episodes, the UK version to just 12 and 3 specials.

When you’re basing data on such small samples it gets a bit trickier, not least because the rating for the finale is included in the series’ overall rating. That’s not a problem when the final episode is 1 of 201, or 0.5% of all episodes and ratings, but the finale of the UK version accounts for around 7% of all ratings (1 of 15, counting the specials).

Could I try this? Yes, but I think the findings are too risky. Still, it’s a great idea and one that could be used in other data reviews.

Do it yourself

You can get all ChallengeResponse’s charts and more (ranked by finale, season average and alphabetically) at Imgur.

He also includes the links to doing it yourself by using IMDbPY and how he visualised it in iPython using matplotlib.

You can get the source code for iPython notebook on GitHub.


Explaining the news: is Vox top?

There are thousands of news sites out there. But what if there was a way to find out which site is best for giving you a good overview of news stories?

I’ve analysed newspapers before, but this is different. At the recent News Impact Summit (NIS), I heard an interesting talk by the engagement manager for Vox, a newish online news site. But a couple of things seemed off with what we were told.

Vox’s spokeswoman told us that her site’s goal is to set itself apart from other news sites by explaining the news, and to do so in a shareable way. She showed examples of ‘cards’ (Vox’s way of displaying content explaining the news), but I couldn’t help having a problem with their examples of ‘easy-to-understand’ content. Of only 6 sentences shown on screen, at least 2 were longer than 30 words*.

At GOV.UK, where I’m currently freelancing, we wouldn’t have that. No, not at all, for user research shows that anything over 25 words is a reading killer. Similarly, we’re told to avoid unnecessary or uncommon words, such as the “hence” that started sentences in the Vox examples.

On the other hand, Vox’s spokeswoman told us it put tremendous effort into polishing headlines to make more readers want to click.

Me being me, this prompted me to wonder what the truth is.



Comparing the news

I selected several of the top news stories of the past year, ones that Vox had a ‘card’ for:

  • the ebola outbreak
  • Islamic State
  • Malaysian Airways MH17 downing over Ukraine
  • the Ukraine crisis
  • Michael Brown shooting and rioting in Ferguson, USA

I wanted to look at comparable news sources. This doesn’t just mean news sites. I looked at a combination of the most popular news sites in the world (that I could access without subscription), along with other ways we get news. So even though BuzzFeed and Reddit aren’t in the top 10 news sites, they are significant news sources for many. I then divided these into new and old media.

‘Old media’ (organisations established before the internet):

  • BBC News — the UK’s most popular news site. Most of its articles are written by its own journalists
  • The Daily Mail — the world’s most popular news site and, unlike the New York Times, I can access its articles. It uses Associated Press articles along with its own
  • The Guardian — another globally popular website but one that aims to be a bit more highbrow than the Mail. Has many guest authors
  • The Economist — though not as popular, like Vox it seeks to explain the news and not just report it. No author bylines, all articles conform to one style

‘New media’ (organisations set up since the internet became popular):

  • Huffington Post — like Vox, this is a ‘new media’ site and very popular. It too has a range of guest authors
  • BuzzFeed — journalists love to spoof its hyperbolic headlines, but it’s increasingly popular, particularly on Facebook, and its UK editor was interesting at the NIS
  • Reddit — a social site with a range of topics. I looked at its ‘Explain like I’m 5’ sub-reddit (thread) for ‘simplified and layman-accessible explanations’
  • Vox — US news site that features both news stories and more in-depth explanations through its ‘cards’

Not every site had a good summary or explainer, while some had more than one. You can see the full list of articles here.

Using various analytic tools – readability analysis programs, word counts, my own sentence splitting, and the LIWC word analysis tool – I ran the articles through several analyses.

What I expected to find and why

Vox says it spends a lot of effort perfecting the headline. Good, for I found in previous research that a good headline — descriptive, inviting, optimised — is vital for getting readers to click.

However, nothing was mentioned of polishing Vox’s content. To be fair to the speaker, she wasn’t a writer, so she may not have had the information. Yet this meant my expectation was that the headlines would be polished but the content could ramble (and not be readable).

As for the other sources… The Guardian is a ‘highbrow’ paper so would probably be the least readable of the major sources. The Economist is also highbrow but takes the view that authors should never assume too much prior knowledge of its readers. As a subscriber I listen to its audio edition and the language flows. Like the BBC, then, being a media firm with a ‘spoken word service’ (so to speak) helps it focus on good readability.

The Daily Mail, however, is so popular that it must appeal to the lowest common denominator — easy reading. The Huffington Post was my main uncertainty — I don’t read it, and going by my social networks, no one else seems to (at least in the UK). But a quick look shows that it has a lot of authors and no set tone.

Finally there are two of the newest sites — Reddit and BuzzFeed. BuzzFeed is a joke to many journalists (sorry, BuzzFeed staff reading this). But at the NIS, the site more (in)famous for headlines like “Can You Make It Through This Post Without Feeling Sexually Attracted to Food?” and “Emoji Facts That Will Make You 🙂” and its ilk seemed to be getting the last laugh. Its UK news editor got respect, albeit grudging, from the more senior hacks there.

In part it’s because BuzzFeed is going beyond cat pictures to do more serious reports. Readers are coming for the memes but staying for the news.

Reddit is slightly different to all the others on this source. It’s a glorified messageboard — anyone can ask a question, anyone can answer. Other users can vote on questions and answers, and as it’s so popular it has a wide range of users, from experts to the average internet commenter. Thanks to the voting of the ‘best’ queries and answers I’ve often found good, clear explanations that go beyond the news article it’s linked to. In particular, the Explain Like I’m 5 sub-Reddit (thread) is dedicated to explaining complex issues (not just the news) and ideas in simple ways.

Results

Data processing

Headlines

Headline complexity

A good headline will give enough detail to describe, but leave enough out to make the reader want to find out more. Today’s readers are presented with so many headlines on a news site’s homepage, let alone their social and other sites, that it’s vital that headlines stand out. One way of doing that is making sure they actually understand (or have a good guess) of what the headline will link to.

I couldn’t measure whether something was clickbait (ie, the content doesn’t match the title), and I find headlines are too short to run a readability analysis. Instead I used word complexity as a proxy, where a ‘complex word’ is any with 3 or more syllables. In other words, long words.

Though not perfect, it does give us an idea of how snappy a headline is. I didn’t look at length because, in this day of search engine optimisation, and based on my previous research, I didn’t find a good correlation between clicks and length.

A good headline then should be long enough to capture the story and capture the reader — no more, no less.
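
As a sketch of the proxy (a crude vowel-group count stands in for a proper syllable dictionary):

    import re

    def syllables(word):
        """Very rough syllable estimate: count groups of vowels."""
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    def complex_word_share(headline):
        """Share of words in a headline with 3 or more (estimated) syllables."""
        words = re.findall(r"[A-Za-z']+", headline)
        if not words:
            return 0.0
        return sum(1 for w in words if syllables(w) >= 3) / len(words)

    print(complex_word_share("11 things you need to know about Ebola"))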

Most complex is the Guardian, followed by BuzzFeed (well it does like words like ‘unbelievable’ and ‘amazing’). Vox, by contrast, has fairly snappy headlines (“11 things you need to know about Ebola”), as do the other new media sites, Reddit and the Huffington Post.

Headline categorisation

While it’s hard to gauge content, the LIWC can give some idea of what the headline is about based on word categories.

Vox says it’s there to explain the news, and it does have a high proportion of insight words (“think”, “know”). The Guardian, by contrast, has more causation words (“because”). Now there’s a subtle difference between causation and insight. My view is that words classed as “insight” are more fact-based (“this is what happened”) whereas causation is more about opinion (“this is why this thing happened”). Both give you an overview, but causation suggests that it’s opinion-led.

This is a subtle distinction, but if true it suggests that the Guardian (and Huffington Post) are likely to have the more opinionated authors, those who (claim to) know the answer. By contrast Vox, like the BBC, is more neutral, focusing on the facts.

For touchy-feely types, Reddit is about the senses (“We’ve been hearing about ebola…”). Of course the main difference between Reddit and the others is that the question (or headline) in this case is posed by one user and answered by others. This will result in varying questioning styles and answers.

Body copy

Let’s go from the headlines now into the meat of the content. So far Vox seems to be doing what it stated — explaining in a fairly neutral way what’s happening, with fairly polished headlines.

Readability

There are different ways to score how easy it is to read an article. These are based on looking at sentence length, complexity (number of syllables) and other factors.

Averaging the outputs I came up with a score, where, like golf, the higher the number the ‘worse’ it is.
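
A sketch of that kind of combined score, using the textstat package as a stand-in for the tools used in this analysis (each formula returns a rough US grade level, so higher means harder to read):

    import textstat

    def readability_score(text):
        """Average several grade-level readability formulas into one score."""
        scores = [
            textstat.flesch_kincaid_grade(text),
            textstat.gunning_fog(text),
            textstat.smog_index(text),
        ]
        return sum(scores) / len(scores)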

As with headlines, the Guardian insists on being complex. Yet Vox isn’t that far off, being the next most complex, in line with my expectations based on those long sentences and non-plain words.

By contrast the BBC is a lot less complex. I did include one article aimed at children on CBBC, but this had a similar readability score to the main BBC News article. The Daily Mail also keeps its writing less complex. Like the BBC it has a broad readership and as such can’t afford to be too complex.

Let’s dig a bit deeper and look at other reasons why the Guardian and others are so complex.

Sentence length distribution

I looked at sentence length partly because of this quote on the GOV.UK blog:

Writing guru Ann Wylie describes research showing that when average sentence length is 14 words, readers understand more than 90% of what they’re reading. At 43 words, comprehension drops to less than 10%.

Cumulative here just means that I keep adding the total in one category to the next. So BuzzFeed has 24% of its sentences in ‘9 and fewer words’, and 51% (24% + 27% for ’10-14 words’) at fewer than 14 words. The Guardian by contrast (yet again) only has 12% of its sentences as short as 9 words.
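If you want to reproduce the bucketing, here’s a minimal sketch using NLTK’s tokenisers. Only the first two buckets are named above; the other boundaries are my assumption.

```python
# Cumulative sentence-length distribution.
import nltk  # requires: nltk.download('punkt')

BUCKETS = [(0, 9), (10, 14), (15, 19), (20, 24), (25, 29), (30, 43), (44, 999)]

def cumulative_lengths(text):
    sentences = nltk.sent_tokenize(text)
    lengths = [len([t for t in nltk.word_tokenize(s) if t.isalnum()])
               for s in sentences]
    if not lengths:
        return []
    cumulative, running = [], 0
    for low, high in BUCKETS:
        running += sum(1 for n in lengths if low <= n <= high)
        cumulative.append((f"{low}-{high} words", running / len(lengths)))
    return cumulative

sample = ("Ebola is spreading. Authorities in these nations have scrambled "
          "to contain the disease, which has now killed thousands of people "
          "across West Africa and shows little sign of slowing down.")
for bucket, share in cumulative_lengths(sample):
    print(f"{bucket}: {share:.0%}")
```

The same list of sentence lengths also gives the long/short split discussed below, once you pick a threshold for what counts as ‘long’.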

Looking at the curves you can see that BuzzFeed has short, punchy sentences, so its curve is steep and peaks early. The Guardian, with its long, wordy sentences, gently curves out as it rambles on. Vox is between the two. That can be a good middle path. Short sentences aren’t always best. They can be distracting.

This method isn’t perfect but with enough data it does give a good indicator — BuzzFeed’s sentences are likely to be understood by more people than the Guardian’s. And Vox’s.

Long sentence split

There’s another way of looking at sentence length — what’s the overall split between long and short sentences?

BuzzFeed really stands out for its snappiness, while a third of the Guardian’s sentences are classed as long. Ouch.

Yet despite having a good readability score, the Daily Mail has sentence length proportions approaching the Guardian’s. We need to find out more.

Adverb use

I believe the road to hell is paved with adverbs, and I will shout it from the rooftops.

Stephen King is just one of many authors and style-guide setters who rail against the adverb, seeing it as a sign of poor writing. Adverbs modify verbs, as in “he quickly walked”. A good writer would generally (and this is a generalisation, as there is debate) use a single stronger verb rather than add an adverb. For example, rather than “quickly walked”, they’d use “darted”, “dashed” and so on (as long as the single word is still plain English).

As such I use adverb count as a rough measure of how good the writing is. It can also be seen as how good the sub-editing process is (if any, sad to say), balanced against the need to let an author’s voice be heard.
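As a rough sketch of that measure (assuming NLTK’s part-of-speech tagger rather than the exact tool used for this analysis), adverbs are simply the Penn treebank RB tags as a share of all words:

```python
# Adverb rate: RB/RBR/RBS tags as a proportion of all words.
import nltk  # requires: nltk.download('punkt'); nltk.download('averaged_perceptron_tagger')

def adverb_rate(text):
    tokens = [t for t in nltk.word_tokenize(text) if t.isalpha()]
    if not tokens:
        return 0.0
    tags = nltk.pos_tag(tokens)
    adverbs = sum(1 for _, tag in tags if tag in ("RB", "RBR", "RBS"))
    return adverbs / len(tokens)

print(f"{adverb_rate('He quickly walked away. She darted across the road.'):.0%}")
```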

Reddit has the highest use of adverbs. Not surprising — users aren’t professional writers nor do they have a sub-editor. I’d be surprised if the authors themselves even spent time editing their work. And that’s to be expected, as Reddit is ultimately a messageboard, not a professional publication.

I was surprised at the number of adverbs in the Huffington Post and the Guardian. Having had the chance to ask a former Guardian sub, I was told that the paper, while keen to maintain its style, doesn’t want to mask the author’s voice. With many authors not being professional writers (and, this being news, they have little time to compose their material), it’s no wonder that adverbs are allowed.

The BBC, by contrast, is in no rush to break news, nor does it have many guest columnists; instead it has professional journalists write most of its content. The Economist is a weekly newspaper so has that increasingly rare luxury of time — time to let writers review and subs to sub. It also aims to have a single, consistent style and voice.

This doesn’t explain the Daily Mail, which sits there, in the middle. But of the 3 articles analysed, 2 were by the Associated Press, which tends to go for a neutral style (unlike the Mail).

Subjects and pronoun use

Finally let’s look at who the authors address and how much of this is personal experience.

Now I’ve not accounted for quotations in this, which by their nature are personal experiences and need attributing (he says).
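A minimal sketch of the pronoun counts is below. LIWC has its own pronoun categories; the short word lists here are my own simplification.

```python
# First- and second-person pronoun rates, as a share of all words.
import re

FIRST_PERSON = {"i", "me", "my", "mine", "myself"}
SECOND_PERSON = {"you", "your", "yours", "yourself", "yourselves"}

def pronoun_rates(text):
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words) or 1
    first = sum(1 for w in words if w in FIRST_PERSON)
    second = sum(1 for w in words if w in SECOND_PERSON)
    return first / total, second / total

first, second = pronoun_rates("We've been hearing about ebola, and I think you should too.")
print(f"first person: {first:.0%}, second person: {second:.0%}")
```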

As before, Reddit, as the most social of the news sources, leads the way with the personal “I”. And with the question being set by another user, it’s natural to respond with “you”. I was surprised that the Huffington Post had a similar proportion, but I wasn’t surprised that the traditional news sources lack the first person.

What can this tell us? GOV.UK tells its writers to address readers as “you”, though I couldn’t find the research to say why this is best. As a writer it does feel more personal to use “you”, but I can’t say why it’s better for the reader. My research on this at Which?, where I had Google Analytics and Omniture data, didn’t lead to any conclusions about user behaviour and the best form of address.

Instead it’s more of interest to see how the split falls between different organisations, and the divide between the old and new media.

Passive voice

Style guides warn against the use of passive voice and encourage the active voice (ie, “Freddie Starr ate my hamster”, not “A hamster was eaten by Freddie Starr”).
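I wasn’t thrilled with the passive-voice checker I used (see ‘Next time’ below), but a common rough heuristic is a form of ‘to be’ followed by a past participle. The sketch below is my own illustration of that heuristic, not the tool behind these results, and it will both miss and over-flag cases.

```python
# Rough passive-voice check: a form of "to be" followed (optionally via one
# intervening word, eg an adverb) by a past participle (Penn tag VBN).
import nltk  # requires: punkt and averaged_perceptron_tagger

BE_FORMS = {"am", "is", "are", "was", "were", "be", "been", "being"}

def looks_passive(sentence):
    tags = nltk.pos_tag(nltk.word_tokenize(sentence))
    for i, (word, _) in enumerate(tags):
        if word.lower() in BE_FORMS:
            for _, nxt_tag in tags[i + 1:i + 3]:
                if nxt_tag == "VBN":
                    return True
    return False

print(looks_passive("A hamster was eaten by Freddie Starr."))  # True
print(looks_passive("Freddie Starr ate my hamster."))          # False
```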

The BBC, the bastion of impartial and neutral news, is the most passive (“it was claimed”). A noble idea, but not always as readable. Vox at the other end is the most direct (“Russia denies it is invading”), along with the Mail and Guardian. BuzzFeed doesn’t do as well here (“Authorities in these nations have scrambled to contain the disease”), but its short sentences seem to carry its overall readability.

Summary

Looking at how easy it is to understand a headline, the new media (Vox, Reddit and Huffington Post) win the day. Their headlines were the most polished and appealing to readers, and stated clearly that they’d explain the news.

The new media sites, with the exception of BuzzFeed (“11 Things You Need To Know About The Ebola Epidemic That’s Killing Thousands”), had less complex headlines. That’s not to say they had short headlines — search words have to be crammed in — but shorter words were generally used.

The best overall readability was the BBC’s, but in terms of sentence length BuzzFeed kept it short and punchy throughout. The Guardian, however, had long headlines and long sentences, hurting its chances of being understood by a wide demographic.

Other observations

Several sites had topic pages, eg the Huffington Post’s MH17 topic, while few had summaries like Vox’s cards or the BBC’s explainers. Topic pages collect all the pages related to a news story in one place. Yet when I tried to use them to find the ‘best’ page or a summary, it was a barren search. They seemed more of a technical solution (grouping similar content) to a technical problem than an editorial answer. I preferred the Vox style of an editorial collection summing up the situation.

I ignored images, which the Daily Mail and BuzzFeed have a large number of, and I don’t know how they may affect readability. When it comes to online content I’m with Alice (of Wonderland fame), who tired of writing that lacks pictures. I don’t know what effect images have on readership, though I know they benefit search engine optimisation.

Finally, I didn’t look at overall word length as this would be unfair on Vox. Though this is a good indicator of readability, the way Vox arranged its content meant its multiple pages would count as one according to the analysis programs.

Conclusion

Breaking news

Does all this matter? News sources cater to different audiences so if the Guardian wants a reader base that has to put in a bit of effort to understand what it’s trying to say, then that’s the Guardian’s choice. Me, I prefer to keep things plain.

I also wonder whether complex readability hurts the Guardian’s influence — if readers aren’t clear what’s being said, then how can the paper have great influence? How many people enjoy struggling through an article? If there’s a good point to be made, let alone a tricky question to answer, why make it hard to understand?

I have no beef with Vox. It’s interesting what they’re doing and I single them out because they presented a statement to a room of journalists and it’s a journalist’s job to challenge. But compared with newspapers that have already been explaining the news for years, such as the Economist, it has much to learn. It wasn’t surprising then to hear that Vox was set up by bloggers. Blogging is a different beast to journalism, though as shown by Vox’s rapid rise, it has benefits for grabbing online readers.

So in answer to the question in the headline — is Vox top? The answer follows Betteridge’s law — no. Vox has good headlines but its content is so dense that it is unlikely to attract the broad demographic it apparently aims for.

Instead I see BuzzFeed continuing its success thanks to its easy-to-read sentences (and so being readable by the widest audience). Yet in contrast to its copy, BuzzFeed’s headlines were long, though at least they described the article.

Yet a quick revisit to Vox showed a different story. While the headlines on Vox’s explanatory cards were well written, the news headlines caused a bit of a headache when we looked at them. “Europe’s leaders have succeeded in making Greece unimportant” had to be read a couple of times to get the meaning. I wasn’t even sure what I’d get when I clicked on that headline.

Is there a best site, as asked at the start of this article? Horses for courses, but to avoid weaselling out, I’d say the BBC strikes the best balance overall, while at the more sensational end BuzzFeed is best. Reddit can be good, but I’d prefer to keep monitoring how it sums up the news before giving a firmer answer.

Next time

If I did this again I’d also want to look at:

  • passive voice proportions through a new tool — I don’t like the passive voice analysis used here, so I’d want a second opinion
  • verb phrases per sentence, apparently a better predictor of readability — this would mean building a new analysis tool (see the sketch after this list)
  • more data — bigger is better, but I didn’t scrape this time as building a scraper would have taken longer than gathering the articles manually
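On the verb-phrases point, a very rough first pass could just count runs of verb-tagged tokens per sentence with NLTK, as sketched below. A proper verb-phrase count would need a real parser, so treat this only as a starting point.

```python
# Approximate "verb phrases per sentence": count runs of verb-tagged tokens
# (Penn VB*) in each sentence and average across the text.
import nltk  # requires: punkt and averaged_perceptron_tagger

def verb_runs_per_sentence(text):
    rates = []
    for sentence in nltk.sent_tokenize(text):
        tags = [tag for _, tag in nltk.pos_tag(nltk.word_tokenize(sentence))]
        runs, in_run = 0, False
        for tag in tags:
            if tag.startswith("VB"):
                if not in_run:
                    runs += 1
                in_run = True
            else:
                in_run = False
        rates.append(runs)
    return sum(rates) / len(rates) if rates else 0.0

print(verb_runs_per_sentence("He quickly walked. She had been running and was tired."))
```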

Predictions

I don’t have the traffic data for any of the sites I analysed. Reddit is probably the closest as it gives a score. Of course if anyone working at those sites wants to send me any data I’d gratefully receive it…

Even with this lack of data, I’d still expect:

  • BBC — slower off the mark with news stories as it spends longer polishing them, so it’s the worst for breaking news, but it’s the easiest news source to comprehend. Will continue to be a go-to news site, but its CBBC news for children needs to be simplified. If traffic is good, its ‘explainers’ may become more popular
  • The Daily Mail — with only one article written by the Mail itself it’s hard to give it a unique distinction, but those selected were easy enough to read. Will remain a global news source
  • The Guardian — plodding headlines and plodding pieces mean that even if articles are read, I’m not sure how much will truly be retained and understood. I wonder how many readers skip straight to the comments. While those who understand it seem to love it, its high reading-comprehension demands mean its demographic will be much narrower than most of the other news sources in this study
  • The Economist — in many ways what Vox is aiming for: each article assumes no prior knowledge, and it’ll remain my go-to newspaper for news summaries. If only its headlines were a little more descriptive and its sentences a bit more active, it might become more popular than it is
  • Huffington Post — it’ll continue to be stuck in the middle ground, neither new nor old media; it’s both too impersonal and too distant, so it occupies a niche. A niche that’s not enticing to me
  • BuzzFeed — I expect a reasonable click-through rate for its headlines, and as its articles are easy to read, users are likely to share them and to read more of them. Expect the news site to grow in popularity. I’m guessing its complex headlines serve its purposes, and I’d be interested to see what testing they’ve done on them
  • Reddit — its users address other users, so don’t expect a polished (or any) response, but it can give an easy-to-digest understanding of the situation (if the article exists). There’s a surprising number of experts on there, from Arnold Schwarzenegger to research scientists. Its future depends on its readers (which reminds me of something)
  • Vox — it will draw readers in with a good click-through rate for headlines, but will have a high bounce (‘exit’) rate and a low click-through to the next page due to its hard-to-read format
  • topic pages on news sites — unless top/relevant/best posts are pinned to the top, these will mainly serve as useful pages for the authors but will be too garbled for the average reader to use

*I blame GOV.UK for my now being able to spot a complex sentence and count it in a Rain Man-esque manner

Categories
Scientific Research

Better writing measured

We say as writers that we can make writing better, but how can we measure this?

You can use editorial authority, or user research, but I wanted to use a way that was simple to analyse, could be done by anyone, and could justify the work we’d been doing.