How’s Your New Year’s Resolution Going?

Talking with friends last week, many of us sheepishly admitted that we have already “broken” our 2017 resolutions. According to a popularly cited study by Norcross in 2002, only 64% of 150 participants who had made resolutions were able to stick with them beyond a month. A recent Marist poll lists some popular choices for New Year’s resolutions. Among them, being a better person, losing weight, and exercising more ranked high among the thousand or so respondents. I began to wonder: are WordPress.com users showing any observable trends surrounding New Year’s resolutions? When I have questions about content across all of WordPress.com, I turn to our Elasticsearch Index!

To find out whether people were writing about New Year’s resolutions, I searched for posts that feature both the words new and resolution in the title, content, tag name, or category name. I used a boolean query with both a must and a should clause. The must clause restricts the search to a site and a date range, while each entry in the should clause checks one field for both words. Notice that I included "minimum_number_should_match": 1 in the query (newer Elasticsearch releases spell this parameter minimum_should_match). This means that at least one of the conditions of the should clause must be met for the result to be considered a match.

{
"query":{
    "bool":{
        "must":[
            {"term": {"site_id": 1}},
            {"range":{"date_gmt":{"from": "2017-01-01","to": "2017-02-07"}}}
            ],
        "should":[
            {"bool":
                {"must":[
                    {"term":{"tag.name.en":"new"}},
                    {"term":{"tag.name.en":"resolution"}}

                    ]
                }
            },
            {"bool":
                {"must":[
                    {"term":{"category.name.en":"new"}},
                    {"term":{"category.name.en":"resolution"}}
                    ]
                }
            },
            {"bool":
                {"must":[
                    {"term":{"content.en":"new"}},
                    {"term":{"content.en":"resolution"}}
                    ]
                }
            },
            {"bool":
                {"must":[
                    {"term":{"title.en":"new"}},
                    {"term":{"title.en":"resolution"}}
                    ]
                }
            }

        ],
        "minimum_number_should_match": 1
        }
    }
}
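The four should clauses differ only in the field they check, so the same query can be assembled programmatically. The following is a minimal Python sketch, not the code used for this post; the helper names `both_words_clause` and `build_resolution_query` are made up for illustration, and the field names and dates are copied from the query above.

```python
import json


def both_words_clause(field, words):
    """Require every word to appear in one field (a nested bool/must)."""
    return {"bool": {"must": [{"term": {field: word}} for word in words]}}


def build_resolution_query(fields, words, date_from, date_to, site_id=1):
    """Filters go in must; one alternative per field goes in should."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"term": {"site_id": site_id}},
                    {"range": {"date_gmt": {"from": date_from, "to": date_to}}},
                ],
                "should": [both_words_clause(field, words) for field in fields],
                # At least one of the per-field alternatives has to match.
                "minimum_number_should_match": 1,
            }
        }
    }


FIELDS = ["tag.name.en", "category.name.en", "content.en", "title.en"]
query = build_resolution_query(
    FIELDS, ["new", "resolution"], "2017-01-01", "2017-02-07"
)
print(json.dumps(query, indent=2))
```

Adding another field to search, or a third required word, then only means extending one of the two lists.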

I found some! As I suspected, the resulting data set is much larger than the samples in the studies I mentioned. Our wealth of public content provides a unique perspective on these questions.

To look at the distribution of those posts over time, we need to add an aggregation clause. This query gives us the number of matching posts published each day.

{
"query":{
    "bool":{...}
    },
"aggs": { "d":
           { "date_histogram":
              { "field": "date_gmt",
                "interval": "day" }
           }
         }
}
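Elasticsearch returns a date histogram as a list of buckets under the aggregation's name, each with a timestamp and a doc_count. As a rough sketch, the buckets can be turned into a date-to-count mapping in Python like this. The function name `posts_per_day` is hypothetical, and the counts in `sample_response` are invented for illustration, not real WordPress.com data.

```python
from datetime import date


def posts_per_day(response, agg_name="d"):
    """Map each histogram bucket to {calendar day: number of posts}."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {
        # key_as_string starts with the ISO date, e.g. "2017-01-01T00:00:00.000Z"
        date.fromisoformat(bucket["key_as_string"][:10]): bucket["doc_count"]
        for bucket in buckets
    }


# Invented sample in the shape of a date_histogram response.
sample_response = {
    "aggregations": {
        "d": {
            "buckets": [
                {"key_as_string": "2017-01-01T00:00:00.000Z", "key": 1483228800000, "doc_count": 40},
                {"key_as_string": "2017-01-02T00:00:00.000Z", "key": 1483315200000, "doc_count": 35},
                {"key_as_string": "2017-01-03T00:00:00.000Z", "key": 1483401600000, "doc_count": 20},
            ]
        }
    }
}

daily = posts_per_day(sample_response)
for day, count in sorted(daily.items()):
    print(day.isoformat(), count)
```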

[Figure: posts over time]
For reference, I’ve also made a plot of posts about dogs by day. Notice that while the number of posts about dogs consistently dips on the weekends, the trend remains constant over time. However, resolution posts show a definite, quick drop-off. Whether or not these writers are still sticking to their resolutions is unclear from this, but they certainly aren’t writing about them as much six weeks later! So far this year, 50% of all posts about New Year’s resolutions occurred within the first 10 days of January.
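A figure like "50% of posts within the first 10 days" comes from a running total over the daily counts. Here is a small Python sketch of that calculation, assuming the counts are already available as a date-to-count mapping. The function name `day_reaching_share` and the numbers in `counts` are made up for illustration.

```python
from datetime import date, timedelta
from itertools import accumulate


def day_reaching_share(counts_by_day, share=0.5):
    """Return the first day on which the running total reaches `share` of all posts."""
    days = sorted(counts_by_day)
    total = sum(counts_by_day.values())
    running_totals = accumulate(counts_by_day[day] for day in days)
    for day, running in zip(days, running_totals):
        if running >= share * total:
            return day
    return None  # no days at all


# Invented daily counts that decay after New Year's Day.
start = date(2017, 1, 1)
counts = {
    start + timedelta(days=offset): count
    for offset, count in enumerate([30, 25, 15, 10, 10, 10])
}

print(day_reaching_share(counts))  # 2017-01-02
```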

Interestingly, resolution posts don’t immediately show the “weekend dip” that we see in dog posts. In general, posting tends to drop a bit on Saturdays and Sundays. By adding the following aggregation to our query, we find that Sunday and Monday are the most popular days for writing about resolutions.

{
"query":{
    "bool":{...}
    },
"aggs": { "d":
           { "terms":
              { "field": "date_token.day_of_week" }
           }
         }
}

In this aggregation clause, I use the key date_token. Our Elasticsearch Index has a token for both the GMT time and the user’s local time. This allows us to look at calendar trends even though we have users all over the world.
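To see why the local-time token matters for day-of-week counts, consider a post published early on a Monday GMT by someone several time zones to the west. The Python sketch below uses a fixed UTC offset as a simplification (real time zones also involve daylight saving rules), and the function name `day_of_week` is hypothetical. It does not reflect how the index computes its tokens.

```python
from datetime import datetime, timedelta, timezone

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def day_of_week(published_utc, utc_offset_hours):
    """Name of the weekday as seen at a fixed offset from GMT."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    local_time = published_utc.astimezone(local_zone)
    return DAY_NAMES[local_time.weekday()]


# 03:30 GMT on Monday, 2 January 2017.
published = datetime(2017, 1, 2, 3, 30, tzinfo=timezone.utc)

print(day_of_week(published, 0))   # Monday
print(day_of_week(published, -8))  # Sunday (19:30 the previous evening)
```

Counting by GMT alone would file this writer's Sunday evening post under Monday.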

So, do our users share the same goals as those in the Marist poll? (Being a better person, losing weight, exercising more, and spending less money were the top four items from the Marist poll.) We have to do some content and tag analysis to find out. For this task, I’m going to switch from our Elasticsearch Index to using Python to analyze a sample of our New Year’s resolution posts.

For ease of calculation, I took a small sample of ~2500 posts from those we identified with our Elasticsearch queries. To identify some common goals of our users, I used a Python library called TextBlob, and extracted two- and three-word phrases, like “social media” or “read more books,” from the post content. I calculated the frequency of these phrases in order to pick out the most common ones.

Once I had the top 200 phrases, I had some more filtering to do. Phrases like “Happy New Year” and “New Year’s Resolutions” were of course very common, but don’t give any information about our users’ resolutions. Other common but irrelevant phrases included “I think,” “I would like,” “really good,” etc. Identifying nouns and verbs seemed like a great next step to discerning the relevant phrases that actually communicated the writer’s intent. By knowing the parts of speech in the phrases, I could identify actions, and exclude phrases that contained a pronoun, for example.

After this process, I had some great examples; “comfort zone,” “fresh start,” and “lose weight” were among them. Also among them were “something new,” “new me,” and “weight loss.” Because of the obvious similarities, I manually grouped them into broader categories and totalled them. The chart below shows the most popular categories in our sample.
[Figure: popular goals]

Weight loss certainly is a popular goal! We also see being a better person and exercise in our sample. Interestingly, our users also have resolutions about social media, reading and, not surprisingly, blogging and writing. Even my own resolution to drink more water appears on the list!
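For readers who want to try the phrase analysis described above, here is a simplified Python sketch of the counting, filtering, and grouping steps. It swaps in the standard library for TextBlob so that it runs without extra installs: a regular expression stands in for TextBlob's tokenizer and its `ngrams` method, and a short pronoun list stands in for the part-of-speech tags. The sample posts, the `BOILERPLATE` set, and the category names are all invented for illustration.

```python
import re
from collections import Counter

# Stand-in for part-of-speech tagging: drop phrases containing a pronoun.
PRONOUNS = {"i", "you", "we", "they", "it", "he", "she"}
# Common phrases that say nothing about the resolution itself.
BOILERPLATE = {"happy new year", "new year", "new year's resolutions"}
# Manual grouping of similar phrases into broader categories (hypothetical names).
CATEGORIES = {
    "lose weight": "weight loss",
    "weight loss": "weight loss",
    "read more books": "reading",
}


def phrases(text, n):
    """All runs of n consecutive words in the text, lowercased."""
    words = re.findall(r"[a-z']+", text.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def phrase_counts(posts, sizes=(2, 3)):
    """Frequency of every two- and three-word phrase across all posts."""
    counts = Counter()
    for post in posts:
        for n in sizes:
            counts.update(phrases(post, n))
    return counts


def keep(phrase):
    """True unless the phrase is boilerplate or contains a pronoun."""
    return phrase not in BOILERPLATE and not (set(phrase.split()) & PRONOUNS)


def category_totals(counts, categories):
    """Sum phrase frequencies within each manually assigned category."""
    totals = Counter()
    for phrase, category in categories.items():
        totals[category] += counts[phrase]
    return totals


# Invented sample posts.
posts = [
    "Happy New Year! This year I want to lose weight and read more books.",
    "My resolution is weight loss, and to read more books.",
    "I hope to lose weight this year.",
]

counts = phrase_counts(posts)
top_phrases = [(p, c) for p, c in counts.most_common() if keep(p)]
totals = category_totals(counts, CATEGORIES)
print(totals.most_common())
```

On a real sample, the `top_phrases` list is what would be reviewed by hand before deciding on the categories.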
