DEV Community

# dataengineering

Posts

Building a Production-Ready Serverless App on Google Cloud (Part 2: The Data Contract)

7
Comments
4 min read
Quantified Self: Building a Production-Grade ETL Pipeline for 10+ Wearables

2
Comments
4 min read
Our Data Extraction Pipeline Worked Perfectly… Until Month 6

1
Comments
2 min read
Share of Shelf Analysis: How to Scrape Zappos Search Results

1
Comments
4 min read
Iterator Patterns: How to Process Millions of Records Without Running Out of Memory

1
Comments
5 min read
Why My Metrics Pipeline with Telegraf Didn’t Work (and What I Learned)

2
Comments
2 min read
Python was too slow for 10M rows—So I built a C-Bridge (and found the hidden data loss)

Comments
2 min read
9 Data Engineering Challenges That Kill Pipelines in Production (And How I approached Them With Pure Snowflake SQL)

1
Comments
14 min read
How I stopped bad data from reaching my warehouse using a single Airflow task

Comments
4 min read
Stop Babysitting Servers: Build a Scalable Serverless Data Lake on AWS

Comments
2 min read
History of Kafka the message broker

2
Comments
3 min read
Introducing QueryFlux: Open-Source Universal Multi-Engine Query Router and SQL Proxy

1
Comments
7 min read
How I Built a Performance Dashboard for a Multi-Office Chiropractic Practice

1
Comments 2
5 min read
Optimizing Continuous Aggregate Performance for Large Datasets

Comments
4 min read
How to Build a Real-Time DynamoDB to S3 Analytics Pipeline with Apache Iceberg

2
Comments
8 min read