Data Engineering


  • Looker NYC Meetup

    Like any company, Automattic is constantly on a journey to get better: sometimes we have the good fortune of finding improvement in leaps and bounds, but most of the time we move slowly, making small changes, finding iterative wins, and moving down the to‑do list. I think this is probably how most progress happens:…

  • Reflections From Spark + AI Summit 2018

    Here are our favorite talks from Spark + AI Summit 2018.

  • Real-Time Elasticsearch Indexing on WordPress.com

    Love databases, indexing, and Elasticsearch gymnastics? Greg Brown walks us through the indexing sausage factory on WordPress.com.

  • Bulk Log Analytics With Hive

    Leveraging the distributed power of MapReduce to perform custom log analysis or one-time queries on raw data is fast and easy, and you don’t even have to build a complicated ETL process to do it. The data engineering team at WordPress.com recently used this approach to query tens of billions of log lines…