🏢

Amazon Redshift

Analytics · Cloud Data Warehouse

Redshift is a data warehouse that stores prepared analytical data for repeated aggregation and dashboards. It gives you a compute layer dedicated to analytics rather than operational transactions.

Architecture Diagram: Data Flow

Components shown: S3 (data), Glue (ETL), leader node, compute node 1, compute node 2, BI tools.

Why do you need it?

If BI queries run on the same operational database that serves user requests, reporting workloads and product traffic compete for the same resources. Without a separate analytical store, the operational system is the first thing to become unstable as reporting grows.

Why did this approach emerge?

Running reports directly on operational databases hit clear limits as scale grew. Dedicated analytics warehouse services such as Redshift, with specialized storage structures and distributed processing engines, emerged in response.

How does it work inside?

Redshift performs fast aggregations over structured and semi-structured data by combining columnar storage with massively parallel processing (MPP). A leader node receives each query and coordinates the compute nodes, which work on their own portions of the data in parallel. Data is typically loaded through S3 and Glue, and BI tools connect to the cluster for recurring reports and dashboards.
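To make the two ideas above concrete, here is a minimal toy sketch in Python of columnar storage combined with parallel partial aggregation. It is not Redshift code and does not reflect Redshift's actual internals: the names `ComputeNode`, `distribute`, and `leader_sum_by_region`, the sample rows, and the hash-based distribution on `order_id` are all assumptions made for illustration.

```python
from collections import defaultdict
from zlib import crc32

# Toy model only: every name here is hypothetical and unrelated to any Redshift API.


class ComputeNode:
    """Holds its share of the table one column at a time (columnar layout)."""

    def __init__(self):
        self.columns = {"order_id": [], "region": [], "amount": []}

    def insert(self, order_id, region, amount):
        self.columns["order_id"].append(order_id)
        self.columns["region"].append(region)
        self.columns["amount"].append(amount)

    def partial_sum_by_region(self):
        # Only the two columns the query needs are read; order_id is never touched.
        totals = defaultdict(int)
        for region, amount in zip(self.columns["region"], self.columns["amount"]):
            totals[region] += amount
        return dict(totals)


def distribute(rows, nodes):
    """Assign each row to a node by hashing a distribution key (assumed: order_id)."""
    for order_id, region, amount in rows:
        index = crc32(str(order_id).encode()) % len(nodes)
        nodes[index].insert(order_id, region, amount)


def leader_sum_by_region(nodes):
    """Leader step: gather each node's partial result and merge the subtotals."""
    merged = defaultdict(int)
    for node in nodes:
        for region, subtotal in node.partial_sum_by_region().items():
            merged[region] += subtotal
    return dict(merged)


rows = [
    (1, "eu", 10),
    (2, "us", 25),
    (3, "eu", 5),
    (4, "ap", 40),
    (5, "us", 15),
]
nodes = [ComputeNode(), ComputeNode()]
distribute(rows, nodes)
totals = leader_sum_by_region(nodes)
print(totals)
```

The point of the sketch is the shape of the work: each node scans only the columns a query needs and produces a small partial result, so the leader merges a handful of subtotals instead of moving every row.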

Boundaries & Distinctions

Redshift and Athena both support SQL analytics, but the loading strategy is different. Redshift excels when you load prepared data into a warehouse for repeated analysis, while Athena excels at reading S3 files directly. If you can model and load data for recurring dashboards and aggregates, look at Redshift; if you want exploratory queries without a load step, look at Athena.

When should you use it?

Redshift is well suited to dashboards, scheduled reports, large-scale aggregations, and analysis that combines data from multiple sources. It is overkill if you only run occasional investigative queries.

Business intelligence · Data warehousing · Batch analytics · Data marts