Data Analytics continues to be a critical area for most modern applications and there are many solutions available, therefore it’s important for application owners to be aware of their options. Given that I focus on Amazon Web Services, I’m interested in solutions that can be launched on compute infrastructure in AWS and that access data stored in the cloud.
For this article, I focused on Data Analytics platforms Starburst Galaxy and Snowflake with compute and data storage infrastructure deployed on AWS. Similar to other articles, I will focus on areas such as performance, infrastructure setup and cost, with the following angle: “What do I need to do in order to run this workload, how fast will it be and how much will I pay for it?”
- Starburst Galaxy is a managed data analytics platform built on top of the Trino query engine (formerly known as PrestoSQL). It is available on multiple cloud providers, such as AWS, GCP or Azure. It can also query data stored in a wide range of options, including these cloud providers and other data sources.
- Snowflake is a managed data analytics platform that also supports launching compute infrastructure in the cloud (e.g. AWS, GCP and Azure) and a number of data sources, including storage in these mentioned cloud providers.
For the tests I executed, I focused on compute infrastructure deployed on Amazon Web Services, in the N. Virginia region and data stored in AWS S3 in the same region.
Data and Infrastructure Setup
Below are the infrastructure configuration details in order to execute these tests on both Starburst Galaxy and Snowflake.
Data
- The data used in this set of tests corresponds to a 1TB TPC-DS dataset, in Apache Iceberg table format.
- Files used in the Iceberg tables were stored in an S3 bucket owned by Concurrency Labs, in the N. Virginia region. The dataset resulted in 12,696 files, for a total of 234 GB. File size is expected, given they are stored in compressed format.
- Both Starburst Galaxy and Snowflake accessed data through a catalog managed by the AWS Glue service.
- Iceberg tables were created using DDL statements following this pattern:
CREATE TABLE glue.<table-name> (<column-definition) WITH (format=PARQUET, location=<s3-location>, type=ICEBERG;)
- Data was inserted into S3 using as a source data stored in Parquet format available in the tpcds.sf1000 schema managed by Starburst Galaxy. This was done using INSERT SQL statements for each TPC-DS table, with the following pattern:
INSERT INTO glue.<schema>.<table-name> SELECT <columns> FROM tpcds.sf1000.<table-name>;
- Row count for each table was confirmed to comply with the TPC-DS standard (approximately 6.35 billion records in total, stored in 24 tables).
- The 99 TPC-DS queries used in these tests correspond to the TPC-DS specification.
Compute Infrastructure
Compute infrastructure was launched using the Starburst Galaxy and Snowflake GUI. Both platforms were configured to launch compute resources on Amazon Web Services in the N. Virginia AWS region (us-east-1) and were granted cross-account permissions (IAM Roles) to access data stored in an S3 bucket owned by Concurrency Labs and an AWS Glue catalog in the same AWS account as the S3 bucket.
Starburst Galaxy
- Size: Large (16 credits/hour).
- Cloud provider: AWS - N. Virginia
- Cluster type: Accelerated (Warp Speed)
- Number of workers: 16
- Query result caching: off.
Snowflake
- Size: X-Large Warehouse (16 credits/hour).
- Cloud provider: AWS - N. Virginia
- Type: Standard
- Number of clusters:
- 1 scenario with multi-cluster disabled (16 credits)
- 1 scenario with scaling enabled to provision between 1 and 3 clusters (16-48 credits). This warehouse had 3 clusters provisioned throughout the whole query execution, given the load induced into the system.
- Query acceleration: off
- Query result caching: off (ran statement
ALTER SESSION SET USE_CACHED_RESULT=FALSE
)
Both Starburst Galaxy and Snowflake deliver a well documented way to launch compute infrastructure. However, none of these platforms expose more details about the underlying compute capacity available in each deployment type (i.e. vCPUs, memory, etc.), which would allow users to have more context regarding other cloud-based solutions, such as AWS EC2 or AWS EMR.
Test Scenarios
All scenarios were executed with 5 concurrent users sequentially executing the 99 TPC-DS queries, iterating through the whole set of queries a number of 5 times per user (i.e. each user executes queries 1 through 99 and repeats the sequence 5 times). Below are the three platform scenarios tested:
Scenario | Description |
---|---|
TPC-DS 1 TB Iceberg - Starburst Galaxy Large | 5 concurrent users executed 5 iterations of 99 TPC-DS queries sequentially, using Starburst Galaxy on a Large cluster of type Warp Speed. |
TPC-DS 1 TB Iceberg - Snowflake X-Large - 1 cluster | Same as above, but using a Snowflake X-Large Standard Warehouse with multi-cluster disabled. |
TPC-DS 1 TB Iceberg - Snowflake X-Large - 1-3 clusters | Same as above, but using a Snowflake X-Large Standard Warehouse with multi-cluster scaling enabled between 1 and 3 clusters. |
Tests were executed using Apache JMeter, a load testing tool designed to trigger and measure performance of concurrent requests. Tests were triggered from an EC2 instance launched in the N. Virginia region and the results below were calculated using the elapsed time reported for each transaction by JMeter and recorded in the EC2 instance.
Test Results
TPC-DS 1 TB Iceberg
The table below shows the average, maximum and minimum execution time in seconds, for the 25 executions triggered for each query (5 concurrent users executing 5 sequential sets of TPC-DS queries).
Starburst Galaxy - Large - WarpSpeed | Snowflake - X-Large Standard Warehouse - 1 cluster | Snowflake - X-Large Standard Warehouse - 3 clusters | |||||||||
avg | max | min | avg | max | min | avg | max | min | |||
q01 | 7 | 18 | 3 | q01 | 12 | 47 | 7 | q01 | 5 | 11 | 3 |
q02 | 3 | 7 | 2 | q02 | 16 | 40 | 5 | q02 | 4 | 8 | 3 |
q03 | 2 | 5 | 1 | q03 | 11 | 64 | 4 | q03 | 3 | 6 | 1 |
q04 | 29 | 47 | 12 | q04 | 68 | 143 | 39 | q04 | 42 | 51 | 32 |
q05 | 12 | 18 | 6 | q05 | 25 | 63 | 5 | q05 | 5 | 11 | 3 |
q06 | 4 | 7 | 3 | q06 | 16 | 56 | 6 | q06 | 4 | 6 | 3 |
q07 | 6 | 10 | 3 | q07 | 13 | 69 | 3 | q07 | 4 | 6 | 2 |
q08 | 5 | 7 | 4 | q08 | 14 | 62 | 3 | q08 | 3 | 4 | 2 |
q09 | 7 | 13 | 4 | q09 | 38 | 60 | 22 | q09 | 21 | 28 | 15 |
q10 | 5 | 9 | 4 | q10 | 22 | 62 | 5 | q10 | 4 | 8 | 3 |
q11 | 17 | 23 | 8 | q11 | 42 | 75 | 21 | q11 | 25 | 31 | 19 |
q12 | 3 | 5 | 2 | q12 | 65 | 1257 | 3 | q12 | 2 | 4 | 1 |
q13 | 10 | 21 | 6 | q13 | 66 | 1258 | 5 | q13 | 4 | 6 | 3 |
q14 | 35 | 50 | 14 | q14 | 69 | 808 | 26 | q14 | 22 | 29 | 17 |
q15 | 4 | 8 | 3 | q15 | 47 | 795 | 4 | q15 | 5 | 9 | 4 |
q16 | 9 | 15 | 5 | q16 | 12 | 43 | 3 | q16 | 4 | 8 | 2 |
q17 | 7 | 13 | 4 | q17 | 19 | 48 | 6 | q17 | 5 | 8 | 4 |
q18 | 9 | 16 | 5 | q18 | 15 | 44 | 5 | q18 | 5 | 8 | 4 |
q19 | 4 | 9 | 3 | q19 | 38 | 532 | 6 | q19 | 5 | 8 | 3 |
q20 | 3 | 6 | 2 | q20 | 248 | 1305 | 3 | q20 | 2 | 4 | 1 |
q21 | 3 | 4 | 2 | q21 | 110 | 838 | 3 | q21 | 2 | 4 | 1 |
q22 | 10 | 14 | 6 | q22 | 122 | 842 | 3 | q22 | 4 | 6 | 3 |
q23 | 46 | 69 | 20 | q23 | 220 | 2660 | 27 | q23 | 29 | 40 | 21 |
q24 | 21 | 29 | 10 | q24 | 2651 | 4362 | 2132 | q24 | 2223 | 4344 | 2123 |
q25 | 6 | 16 | 4 | q25 | 195 | 1122 | 7 | q25 | 4 | 7 | 3 |
q26 | 4 | 7 | 3 | q26 | 143 | 1297 | 5 | q26 | 3 | 5 | 2 |
q27 | 4 | 8 | 3 | q27 | 129 | 782 | 3 | q27 | 3 | 6 | 2 |
q28 | 8 | 13 | 3 | q28 | 88 | 872 | 13 | q28 | 14 | 18 | 12 |
q29 | 8 | 25 | 5 | q29 | 145 | 1299 | 7 | q29 | 5 | 7 | 4 |
q30 | 5 | 8 | 4 | q30 | 373 | 2523 | 3 | q30 | 5 | 10 | 3 |
q31 | 6 | 9 | 4 | q31 | 100 | 818 | 5 | q31 | 6 | 12 | 4 |
q32 | 2 | 3 | 2 | q32 | 9 | 50 | 2 | q32 | 1 | 3 | 1 |
q33 | 8 | 17 | 4 | q33 | 78 | 1293 | 3 | q33 | 4 | 10 | 2 |
q34 | 3 | 4 | 3 | q34 | 11 | 34 | 1 | q34 | 2 | 5 | 1 |
q35 | 6 | 12 | 4 | q35 | 48 | 841 | 7 | q35 | 8 | 12 | 5 |
q36 | 4 | 7 | 3 | q36 | 64 | 1291 | 4 | q36 | 2 | 4 | 1 |
q37 | 2 | 4 | 2 | q37 | 9 | 30 | 3 | q37 | 2 | 4 | 1 |
q38 | 8 | 17 | 4 | q38 | 58 | 850 | 16 | q38 | 14 | 21 | 10 |
q39 | 7 | 8 | 5 | q39 | 8 | 19 | 3 | q39 | 3 | 5 | 2 |
q40 | 3 | 4 | 2 | q40 | 10 | 28 | 3 | q40 | 3 | 5 | 2 |
q41 | 1 | 1 | 1 | q41 | 10 | 60 | 1 | q41 | 1 | 3 | 0 |
q42 | 2 | 4 | 1 | q42 | 6 | 38 | 1 | q42 | 1 | 4 | 1 |
q43 | 3 | 5 | 2 | q43 | 10 | 39 | 2 | q43 | 2 | 2 | 1 |
q44 | 3 | 5 | 2 | q44 | 63 | 1224 | 7 | q44 | 7 | 11 | 5 |
q45 | 5 | 8 | 4 | q45 | 13 | 75 | 5 | q45 | 4 | 7 | 3 |
q46 | 5 | 11 | 3 | q46 | 44 | 760 | 5 | q46 | 5 | 7 | 4 |
q47 | 32 | 48 | 15 | q47 | 16 | 31 | 9 | q47 | 8 | 14 | 7 |
q48 | 9 | 34 | 5 | q48 | 11 | 20 | 4 | q48 | 3 | 6 | 3 |
q49 | 6 | 10 | 4 | q49 | 11 | 22 | 5 | q49 | 6 | 12 | 3 |
q50 | 17 | 31 | 9 | q50 | 11 | 18 | 6 | q50 | 5 | 7 | 4 |
q51 | 7 | 14 | 3 | q51 | 12 | 26 | 5 | q51 | 5 | 7 | 4 |
q52 | 2 | 3 | 1 | q52 | 7 | 39 | 1 | q52 | 1 | 3 | 1 |
q53 | 3 | 7 | 2 | q53 | 7 | 31 | 2 | q53 | 2 | 4 | 1 |
q54 | 5 | 8 | 4 | q54 | 9 | 27 | 3 | q54 | 3 | 6 | 2 |
q55 | 2 | 2 | 1 | q55 | 8 | 21 | 2 | q55 | 1 | 3 | 1 |
q56 | 6 | 12 | 3 | q56 | 9 | 24 | 3 | q56 | 3 | 8 | 2 |
q57 | 19 | 30 | 9 | q57 | 12 | 32 | 5 | q57 | 5 | 9 | 3 |
q58 | 9 | 26 | 4 | q58 | 18 | 118 | 3 | q58 | 3 | 6 | 2 |
q59 | 6 | 14 | 2 | q59 | 20 | 216 | 6 | q59 | 6 | 9 | 4 |
q60 | 7 | 13 | 4 | q60 | 21 | 163 | 5 | q60 | 5 | 7 | 3 |
q61 | 6 | 12 | 4 | q61 | 18 | 123 | 4 | q61 | 5 | 7 | 4 |
q62 | 5 | 13 | 3 | q62 | 37 | 491 | 2 | q62 | 2 | 4 | 1 |
q63 | 3 | 7 | 2 | q63 | 26 | 149 | 2 | q63 | 2 | 4 | 1 |
q64 | 17 | 23 | 13 | q64 | 102 | 569 | 22 | q64 | 24 | 29 | 20 |
q65 | 11 | 16 | 5 | q65 | 22 | 85 | 10 | q65 | 8 | 11 | 6 |
q66 | 5 | 8 | 3 | q66 | 27 | 169 | 2 | q66 | 4 | 8 | 2 |
q67 | 69 | 92 | 36 | q67 | 183 | 324 | 149 | q67 | 158 | 176 | 141 |
q68 | 5 | 10 | 3 | q68 | 26 | 154 | 6 | q68 | 4 | 6 | 3 |
q69 | 5 | 7 | 4 | q69 | 36 | 153 | 6 | q69 | 4 | 8 | 3 |
q70 | 7 | 14 | 5 | q70 | 23 | 55 | 4 | q70 | 4 | 7 | 3 |
q71 | 4 | 8 | 3 | q71 | 20 | 145 | 3 | q71 | 2 | 5 | 1 |
q72 | 28 | 46 | 15 | q72 | 49 | 184 | 19 | q72 | 19 | 24 | 14 |
q73 | 3 | 5 | 3 | q73 | 42 | 306 | 2 | q73 | 2 | 3 | 1 |
q74 | 9 | 18 | 5 | q74 | 45 | 169 | 20 | q74 | 20 | 27 | 15 |
q75 | 16 | 29 | 9 | q75 | 32 | 62 | 16 | q75 | 16 | 23 | 11 |
q76 | 9 | 21 | 4 | q76 | 37 | 297 | 6 | q76 | 7 | 9 | 4 |
q77 | 6 | 9 | 4 | q77 | 13 | 53 | 3 | q77 | 4 | 6 | 2 |
q78 | 29 | 43 | 12 | q78 | 121 | 151 | 93 | q78 | 73 | 106 | 53 |
q79 | 5 | 11 | 4 | q79 | 62 | 1241 | 4 | q79 | 5 | 10 | 4 |
q80 | 7 | 10 | 5 | q80 | 75 | 1306 | 10 | q80 | 8 | 12 | 6 |
q81 | 6 | 14 | 4 | q81 | 55 | 793 | 4 | q81 | 4 | 6 | 3 |
q82 | 3 | 6 | 2 | q82 | 52 | 838 | 2 | q82 | 2 | 4 | 2 |
q83 | 6 | 19 | 4 | q83 | 12 | 29 | 2 | q83 | 3 | 7 | 1 |
q84 | 5 | 15 | 3 | q84 | 20 | 109 | 3 | q84 | 3 | 6 | 2 |
q85 | 10 | 20 | 7 | q85 | 19 | 101 | 5 | q85 | 6 | 11 | 4 |
q86 | 5 | 8 | 3 | q86 | 9 | 22 | 1 | q86 | 2 | 4 | 1 |
q87 | 7 | 15 | 3 | q87 | 33 | 52 | 18 | q87 | 22 | 30 | 17 |
q88 | 8 | 13 | 5 | q88 | 78 | 1286 | 14 | q88 | 16 | 21 | 13 |
q89 | 3 | 6 | 2 | q89 | 10 | 28 | 2 | q89 | 2 | 5 | 1 |
q90 | 3 | 6 | 2 | q90 | 41 | 815 | 1 | q90 | 3 | 8 | 1 |
q91 | 5 | 16 | 4 | q91 | 63 | 1293 | 1 | q91 | 2 | 4 | 1 |
q92 | 2 | 9 | 2 | q92 | 8 | 19 | 2 | q92 | 3 | 5 | 2 |
q93 | 22 | 39 | 11 | q93 | 44 | 835 | 5 | q93 | 6 | 9 | 5 |
q94 | 6 | 15 | 4 | q94 | 16 | 75 | 2 | q94 | 3 | 7 | 2 |
q95 | 13 | 31 | 6 | q95 | 12 | 30 | 3 | q95 | 4 | 7 | 2 |
q96 | 3 | 8 | 2 | q96 | 7 | 11 | 2 | q96 | 2 | 5 | 2 |
q97 | 10 | 16 | 6 | q97 | 18 | 48 | 7 | q97 | 9 | 14 | 6 |
q98 | 3 | 10 | 2 | q98 | 17 | 77 | 2 | q98 | 2 | 3 | 1 |
q99 | 7 | 14 | 4 | q99 | 10 | 53 | 2 | q99 | 3 | 6 | 2 |
Avg (sec) | 9 | 16 | 5 | Avg (sec) | 73 | 448 | 30 | Avg (sec) | 31 | 56 | 28 |
Total (sec) | 861 | 1,549 | 482 | Total (sec) | 7,261 | 44,358 | 2,967 | Total (sec) | 3,043 | 5,555 | 2,741 |
Average execution time (sec) | Comparisons | |||||
---|---|---|---|---|---|---|
Starburst Galaxy | Snowflake - 1 cluster | Snowflake - 3 clusters | Galaxy vs Snowflake 1 cluster | Galaxy vs Snowflake 3 clusters | Snowflake 3 clusters vs. 1 cluster | |
q01 | 7 | 12 | 5 | 0.55 | 1.33 | 0.42 |
q02 | 3 | 16 | 4 | 0.22 | 0.81 | 0.27 |
q03 | 2 | 11 | 3 | 0.22 | 0.92 | 0.24 |
q04 | 29 | 68 | 42 | 0.43 | 0.69 | 0.61 |
q05 | 12 | 25 | 5 | 0.47 | 2.56 | 0.18 |
q06 | 4 | 16 | 4 | 0.24 | 0.95 | 0.25 |
q07 | 6 | 13 | 4 | 0.45 | 1.66 | 0.27 |
q08 | 5 | 14 | 3 | 0.36 | 1.93 | 0.19 |
q09 | 7 | 38 | 21 | 0.18 | 0.32 | 0.55 |
q10 | 5 | 22 | 4 | 0.25 | 1.32 | 0.19 |
q11 | 17 | 42 | 25 | 0.4 | 0.66 | 0.6 |
q12 | 3 | 65 | 2 | 0.04 | 1.45 | 0.03 |
q13 | 10 | 66 | 4 | 0.15 | 2.26 | 0.07 |
q14 | 35 | 69 | 22 | 0.5 | 1.6 | 0.31 |
q15 | 4 | 47 | 5 | 0.08 | 0.8 | 0.11 |
q16 | 9 | 12 | 4 | 0.74 | 2.49 | 0.3 |
q17 | 7 | 19 | 5 | 0.36 | 1.26 | 0.29 |
q18 | 9 | 15 | 5 | 0.59 | 1.83 | 0.32 |
q19 | 4 | 38 | 5 | 0.11 | 0.98 | 0.12 |
q20 | 3 | 248 | 2 | 0.01 | 1.42 | 0.01 |
q21 | 3 | 110 | 2 | 0.02 | 1.8 | 0.01 |
q22 | 10 | 122 | 4 | 0.08 | 2.4 | 0.03 |
q23 | 46 | 220 | 29 | 0.21 | 1.62 | 0.13 |
q24 | N/A | N/A | N/A | N/A | N/A | N/A |
q25 | 6 | 195 | 4 | 0.03 | 1.3 | 0.02 |
q26 | 4 | 143 | 3 | 0.03 | 1.52 | 0.02 |
q27 | 4 | 129 | 3 | 0.03 | 1.54 | 0.02 |
q28 | 8 | 88 | 14 | 0.09 | 0.58 | 0.16 |
q29 | 8 | 145 | 5 | 0.06 | 1.75 | 0.03 |
q30 | 5 | 373 | 5 | 0.01 | 1.17 | 0.01 |
q31 | 6 | 100 | 6 | 0.06 | 1.05 | 0.06 |
q32 | 2 | 9 | 1 | 0.24 | 1.62 | 0.15 |
q33 | 8 | 78 | 4 | 0.1 | 1.83 | 0.05 |
q34 | 3 | 11 | 2 | 0.28 | 1.48 | 0.19 |
q35 | 6 | 48 | 8 | 0.12 | 0.77 | 0.16 |
q36 | 4 | 64 | 2 | 0.06 | 1.9 | 0.03 |
q37 | 2 | 9 | 2 | 0.24 | 1.21 | 0.2 |
q38 | 8 | 58 | 14 | 0.13 | 0.56 | 0.24 |
q39 | 7 | 8 | 3 | 0.86 | 2.1 | 0.41 |
q40 | 3 | 10 | 3 | 0.31 | 1.11 | 0.28 |
q41 | 1 | 10 | 1 | 0.09 | 1.29 | 0.07 |
q42 | 2 | 6 | 1 | 0.27 | 1.7 | 0.16 |
q43 | 3 | 10 | 2 | 0.28 | 1.87 | 0.15 |
q44 | 3 | 63 | 7 | 0.05 | 0.41 | 0.11 |
q45 | 5 | 13 | 4 | 0.39 | 1.29 | 0.3 |
q46 | 5 | 44 | 5 | 0.11 | 1.06 | 0.11 |
q47 | 32 | 16 | 8 | 1.99 | 3.88 | 0.51 |
q48 | 9 | 11 | 3 | 0.81 | 2.56 | 0.31 |
q49 | 6 | 11 | 6 | 0.55 | 1.05 | 0.52 |
q50 | 17 | 11 | 5 | 1.63 | 3.87 | 0.42 |
q51 | 7 | 12 | 5 | 0.55 | 1.35 | 0.41 |
q52 | 2 | 7 | 1 | 0.26 | 1.89 | 0.14 |
q53 | 3 | 7 | 2 | 0.37 | 1.63 | 0.23 |
q54 | 5 | 9 | 3 | 0.59 | 1.58 | 0.38 |
q55 | 2 | 8 | 1 | 0.2 | 1.6 | 0.13 |
q56 | 6 | 9 | 3 | 0.65 | 1.85 | 0.35 |
q57 | 19 | 12 | 5 | 1.56 | 4.04 | 0.39 |
q58 | 9 | 18 | 3 | 0.5 | 2.62 | 0.19 |
q59 | 6 | 20 | 6 | 0.3 | 1.07 | 0.28 |
q60 | 7 | 21 | 5 | 0.32 | 1.47 | 0.22 |
q61 | 6 | 18 | 5 | 0.33 | 1.24 | 0.26 |
q62 | 5 | 37 | 2 | 0.14 | 2.43 | 0.06 |
q63 | 3 | 26 | 2 | 0.11 | 1.75 | 0.06 |
q64 | 17 | 102 | 24 | 0.17 | 0.73 | 0.23 |
q65 | 11 | 22 | 8 | 0.49 | 1.3 | 0.38 |
q66 | 5 | 27 | 4 | 0.19 | 1.19 | 0.16 |
q67 | 69 | 183 | 158 | 0.38 | 0.44 | 0.86 |
q68 | 5 | 26 | 4 | 0.17 | 1.07 | 0.16 |
q69 | 5 | 36 | 4 | 0.14 | 1.3 | 0.11 |
q70 | 7 | 23 | 4 | 0.31 | 1.89 | 0.16 |
q71 | 4 | 20 | 2 | 0.18 | 1.59 | 0.11 |
q72 | 28 | 49 | 19 | 0.57 | 1.51 | 0.38 |
q73 | 3 | 42 | 2 | 0.08 | 2 | 0.04 |
q74 | 9 | 45 | 20 | 0.19 | 0.43 | 0.45 |
q75 | 16 | 32 | 16 | 0.5 | 1.03 | 0.49 |
q76 | 9 | 37 | 7 | 0.23 | 1.3 | 0.18 |
q77 | 6 | 13 | 4 | 0.46 | 1.61 | 0.28 |
q78 | 29 | 121 | 73 | 0.24 | 0.4 | 0.61 |
q79 | 5 | 62 | 5 | 0.08 | 1 | 0.08 |
q80 | 7 | 75 | 8 | 0.1 | 0.94 | 0.1 |
q81 | 6 | 55 | 4 | 0.1 | 1.28 | 0.08 |
q82 | 3 | 52 | 2 | 0.06 | 1.32 | 0.04 |
q83 | 6 | 12 | 3 | 0.46 | 2.07 | 0.22 |
q84 | 5 | 20 | 3 | 0.22 | 1.5 | 0.15 |
q85 | 10 | 19 | 6 | 0.51 | 1.67 | 0.3 |
q86 | 5 | 9 | 2 | 0.52 | 2.76 | 0.19 |
q87 | 7 | 33 | 22 | 0.2 | 0.3 | 0.65 |
q88 | 8 | 78 | 16 | 0.11 | 0.51 | 0.21 |
q89 | 3 | 10 | 2 | 0.31 | 1.63 | 0.19 |
q90 | 3 | 41 | 3 | 0.07 | 1.07 | 0.07 |
q91 | 5 | 63 | 2 | 0.07 | 2.09 | 0.04 |
q92 | 2 | 8 | 3 | 0.29 | 0.81 | 0.35 |
q93 | 22 | 44 | 6 | 0.51 | 3.66 | 0.14 |
q94 | 6 | 16 | 3 | 0.37 | 2.19 | 0.17 |
q95 | 13 | 12 | 4 | 1.12 | 3.5 | 0.32 |
q96 | 3 | 7 | 2 | 0.42 | 1.22 | 0.34 |
q97 | 10 | 18 | 9 | 0.57 | 1.11 | 0.51 |
q98 | 3 | 17 | 2 | 0.16 | 1.56 | 0.11 |
q99 | 7 | 10 | 3 | 0.69 | 2.4 | 0.29 |
Avg | 9 | 47 | 8 | 0.18 | 1.02 | 0.18 |
Median | 6 | 22 | 4 | 0.24 | 1.46 | 0.19 |
Max | 69 | 373 | 158 | 1.99 | 4.04 | 0.86 |
Min | 1 | 6 | 1 | 0.01 | 0.3 | 0.01 |
Total (avg) | 840 | 4,610 | 820 |
q24 took significantly longer to execute in Snowflake scenarios compared to Galaxy (>100x on average). Therefore this query was removed from the comparison tables, in order to avoid data distortion and to make the total comparisons more accurate. Even though this would require more troubleshooting, data from similar tests using Snowflake interacting with other catalogs show better performance for this query, therefore this could be related to the Snowflake integration with AWS Glue as a data catalog.
The table below shows the average query execution time relative to each scenario, as a percentage (excluding q24):
relative to: | |||
---|---|---|---|
Starburst Galaxy Large - WarpSpeed |
Snowflake X-Large Standard Warehouse 1 cluster |
Snowflake X-Large Standard Warehouse 3 clusters |
|
Starburst Galaxy Large - WarpSpeed |
N/A | 18% | 102% |
Snowflake X-Large Standard Warehouse 1 cluster |
555% | N/A | 562% |
Snowflake X-Large Standard Warehouse 3 clusters |
98% | 18% | N/A |
Observations:
- q24 took > 100x to complete in Snowflake scenarios compared to Starburst Galaxy.
- All queries executed successfully for both Starburst Galaxy and Snowflake.
- Starburst Galaxy was on average 5.55x faster compared to Snowflake X-Large running 1 cluster (excluding q24 from this comparison).
- On average, Starburst Galaxy performance was virtually equal compared to Snowflake X-Large running 3 clusters (excluding q24 from this comparison).
Cost Analysis
The following table compares the cost of running a Large Starburst Galaxy cluster (16 credits/hour) and Snowflake Warehouses (1 and 3 clusters) in the AWS N. Virginia region. The cost per credit below is based on the publicly available documentation for both platforms.
Starburst Galaxy Large
Component | Price Dimension | Usage | Hourly Cost | Monthly Cost |
---|---|---|---|---|
Data store | S3 Standard Storage | 234 GB (Iceberg format) | $0.01 | $5 |
Starburst Galaxy Large Cluster | Credits per hour ($2.80) | Large cluster: 16 credits per hour | $44.80 | $32,256 |
Data Transfer | S3 to EC2 - Intra-Regional Data Transfer | N/A | $0.00 | $0.00 |
Total: | $44.81 | $32,261 |
Snowflake X-Large Warehouse - 1 cluster
Component | Price Dimension | Usage | Hourly Cost | Monthly Cost |
---|---|---|---|---|
Data store | S3 Standard Storage | 234 GB (Iceberg format) | $0.01 | $5 |
Snowflake X-Large Warehouse (1 cluster) | Credits per hour ($3.00) | X-Large warehouse (1 cluster): 16 credits per hour | $48.00 | $34,560 |
Data Transfer | S3 to EC2 - Intra-Regional Data Transfer | N/A | $0.00 | $0.00 |
Total: | $48.01 | $34,565 |
Snowflake X-Large Warehouse - 3 clusters
Component | Price Dimension | Usage | Hourly Cost | Monthly Cost |
---|---|---|---|---|
Data store | S3 Standard Storage | 234 GB (Iceberg format) | $0.01 | $5 |
Snowflake X-Large Warehouse (3 clusters) | Credits per hour ($3.00) | X-Large warehouse (3 clusters): 48 credits per hour | $144.00 | $103,680 |
Data Transfer | S3 to EC2 - Intra-Regional Data Transfer | N/A | $0.00 | $0.00 |
Total: | $144.01 | $103,685 |
It is highly recommended to configure the Auto Suspend feature - for both Galaxy and Snowflake, within a range of 3-5 minutes in order to avoid the possibility of accidentally leaving compute resources running when not being utilized. For AWS deployments, it’s also highly recommended to deploy compute resources and storage (in this case, S3) in the same AWS region, in order to avoid Intra-Regional Data Transfer cost and reduce latency as much as possible.
Considering all required components (storage, compute, etc.), owning an always-on Starburst Galaxy Large cluster would cost approximately $32.3K per month, while an equivalent Snowflake X-Large warehouse (1 cluster, 16 credits) would cost $34.6K. The Snowflake warehouse that had equivalent performance to Starburst Galaxy (Large Warehouse with 3 clusters - 48 credits) would cost approximately $104K per month, or 3.2x compared to Starburst Galaxy.
Conclusions
- Both platforms are relatively simple to launch using their respective GUIs. When using data stored in AWS, it is required to configure cross-account IAM Roles in order to grant these platforms access to the data that will be analyzed.
- Neither of these two platforms exposes more details about the underlying compute capacity available in each deployment type (i.e. vCPUs, memory, etc.). This information would be useful, since it would allow application owners to have more context regarding other cloud-based solutions, such as AWS Redshift, EC2 or EMR. The main comparison that can be made against other cloud-based platforms is based on cost relative to performance.
- All TPC-DS queries executed successfully for Starburst Galaxy and Snowflake.
- However, q24 took approximately >100x to complete in Snowflake scenarios compared to Starburst Galaxy, therefore it was removed from comparison calculations.
- Excluding q24, Starburst Galaxy Large (16 workers) was on average 5.55x faster compared to Snowflake X-Large running 1 cluster (16 credits).
- Excluding q24, Starburst Galaxy Large (16 workers) had on average equal performance compared to Snowflake X-Large running 3 clusters (48 credits).
- Considering all required components (storage, compute, etc.), running an always-on Starburst Galaxy Large cluster would cost approximately $32.3K per month, while an equivalent Snowflake X-Large warehouse (1 cluster, 16 credits) would cost $34.6K.
- The Snowflake warehouse that had equivalent performance to Starburst Galaxy (Snowflake Large Warehouse with 3 clusters - 48 credits) would cost $104K per month, or approximately 3.2x compared to Starburst Galaxy.
Do you need help evaluating the right Data Analytics platform or optimizing your Big Data workloads in the cloud?
I will help you find the right Data Analytics platform for your business needs and optimize your Big Data workloads in the cloud to make sure they deliver the right balance between performance and cost for your business. Click on the button below to schedule a free consultation or use the contact form.