GPU-Accelerated Data Science and SQL in Python

Felipe and Rodrigo Aramburu, co-founders of BlazingSQL, join Matt Rocklin and Hugo Bowne-Anderson to discuss GPU-accelerated data science and SQL in Python.

BlazingSQL is the second-largest contributor to RAPIDS, a GPU data science ecosystem, and has built a distributed SQL engine leveraging both cuDF (a pandas-like dataframe on GPUs) and Dask.

GPUs are notoriously tricky, but they don’t have to be. We’ll discuss how BlazingSQL and RAPIDS provide a quick and easy onramp to GPUs. We’ll demonstrate how, with BlazingSQL, data scientists can:

  1. Run performant queries on raw datasets (no import);
  2. Use open-source standards to interoperate with a wide range of technologies;
  3. Launch their own GPU-accelerated workloads with relative ease.

If you know a bit of SQL, you’ll see how quickly you can turbocharge your ETL workloads with minimal effort.
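
As a rough sketch of what querying a raw file can look like, the snippet below registers a file as a table and runs SQL against it, with no separate import step. It assumes BlazingSQL's `BlazingContext` and its `create_table` and `sql` methods; the helper names `query_raw_file` and `main`, the table name, the file path, and the column names are hypothetical and only there for illustration.

```python
def query_raw_file(bc, table_name, path, sql):
    """Register a raw file as a table on the context `bc`, then run `sql`.

    `bc` is expected to behave like a blazingsql.BlazingContext,
    i.e. it needs `create_table` and `sql` methods.
    """
    bc.create_table(table_name, path)  # point the engine at the file in place
    return bc.sql(sql)                 # run the query and hand back the result


def main():
    # Imported lazily so the helper above can be read and tested without a GPU.
    from blazingsql import BlazingContext

    bc = BlazingContext()
    # Hypothetical path and columns, for illustration only.
    result = query_raw_file(
        bc,
        "taxi",
        "/data/taxi.parquet",
        "SELECT passenger_count, COUNT(*) AS trips "
        "FROM taxi GROUP BY passenger_count",
    )
    print(result.head())
```

Calling `main()` needs a machine with a supported GPU and BlazingSQL installed; the helper itself only relies on the two methods named above.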

If you’re comfortable with Dask or other forms of distributed compute, you’ll also learn how to distribute query execution across one or even hundreds of GPUs, and what the future of fully distributed ETL workflows and pipelines might look like!
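
For readers who already use Dask, a minimal sketch of the distributed setup might look like the following. It assumes `LocalCUDACluster` from `dask_cuda`, `Client` from `dask.distributed`, and the `dask_client` and `network_interface` arguments of `BlazingContext`; the helper name `make_distributed_context` and the default interface `"lo"` (loopback, suitable only for a single machine) are assumptions made for this example.

```python
def make_distributed_context(network_interface="lo"):
    """Start a local Dask cluster with one worker per GPU and attach
    a BlazingContext to it.

    Returns the context and the Dask client so the caller can shut
    the cluster down later.
    """
    # Imported lazily: these packages require a GPU-enabled environment.
    from dask_cuda import LocalCUDACluster
    from dask.distributed import Client
    from blazingsql import BlazingContext

    cluster = LocalCUDACluster()   # one Dask worker per visible GPU
    client = Client(cluster)       # connect a client to that cluster
    bc = BlazingContext(
        dask_client=client,
        network_interface=network_interface,
    )
    return bc, client
```

Queries issued through the returned context with `bc.sql(...)` are then executed across the workers; on a multi-node cluster the `Client` would point at an existing scheduler instead of a local one.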

Join us this Thursday, August 7th at 5pm US Eastern time on our YouTube channel as we dive into a Pythonic ecosystem for accelerating data science workloads with GPUs, and see how data scientists can scale their workloads from gigabytes to terabytes with relative ease.
