A Brand Image Analysis of WordPress and Automattic on Twitter

As a data scientist, I spend a lot of time analyzing how our users interact with WordPress.com. However, WordPress.com isn’t the only place to gain insight into how people use and talk about our services. Many WordPress.org and WordPress.com discussions take place on social media. Analyzing these discussions can help us understand what our users are …

Bulk Log Analytics With Hive

Leveraging the distributed power of MapReduce for custom log analysis or one-time queries on raw data is fast and easy, and you don't even have to build a complicated ETL process to do it. The data engineering team at WordPress.com recently used this approach to query tens of billions of log lines with just a couple of minutes of work.

Evolution of a Plot: Better Data Visualization, One Step at a Time

The goal of data visualization is to transform numbers into insights. However, default data visualization output often disappoints. Sometimes, the graph shows irrelevant data or misses important aspects; sometimes, the graph lacks context; sometimes, it’s difficult to read. Often, data practitioners “feel” that something isn’t right with the graph, but cannot pinpoint the problem. In …

Network Science at Automattic: Mapping the Communities of WP.com — Methodology

If you have read our analysis of the communities of WordPress.com and would like to know more about the methods behind it, keep reading! In this slightly more technical post, I will show how we constructed, filtered, projected, and clustered a network around WordPress.com users and blogs. Building the Network of …

Intro to Search: Anatomy of a search engine

Welcome to the second post in our "Intro to Search" series! Today, we'll dig into the building blocks of search engines to give you an idea of how we identify which posts to show readers using WordPress.com's search tool. A (web) search engine connects users with relevant documents. This process generally has five main stages: …