We are the Data Team at Automattic, and this is our blog!

Automattic powers WordPress.com and several other products. Here on the Data Team, we do some very cool stuff, and we’d like to share some of that with you! Here are just a few of the things we do:

  • We use smart statistics to understand how users can become successful bloggers.
  • We use network science to connect readers with content they’ll enjoy and to unveil the community structure of WordPress.com.
  • We develop search algorithms to help users find relevant content on our platform.
  • We build the data pipelines that help the rest of the company build a sustainable business.

Over 26% of the web is powered by WordPress! On an average day, the sites on WordPress.com alone stream about 1.5 billion events, and our largest Hadoop cluster does around 32 TB of reads and writes. Our largest Elasticsearch cluster has 45 nodes and handles about 35 million queries a day. We’re continuously developing this data ecosystem, and we’ll share the journey with you.

Whether you are a fellow data professional or someone looking to learn more about data science and engineering, come join us for breakfast!

P.S. Sound fun? Come work with us!