PyData Triangle March 2022 Meetup

Virtual

Details
PyData Triangle welcomes you to another exciting event.

This will be an online event. You must RSVP to this meetup event in order to see the Teams/Zoom URL. Revisit this page on the day of the meeting, as the URL might change.

Speakers:

Gus Cavanaugh
Gordon Hart
YOU: Lightning Talks (sign up for a 5-minute lightning talk slot at the meeting by posting in the chat, or sign up in advance by commenting on this announcement)

Schedule:
6:00-6:15 Announcements
6:15-7:15 Gus Cavanaugh
7:15-8:15 Gordon Hart
8:15-8:30 Lightning talks

The PyData code of conduct (http://pydata.org/code-of-conduct.html) is enforced at this Meetup. Attendees violating these rules may be asked to leave the meetup at the sole discretion of the meetup organizer.

NOTE: This meeting will be recorded.

Please propose a presentation or speaker for a future PyData Triangle meetup by contacting any of the organizers (Yanlei Peng, Dhruv Sakalley, Gene Ferruzza, or Mark Hutchinson) through Meetup messages.

Follow us on Twitter at https://twitter.com/pydatatriangle

***

Presenter: Gus Cavanaugh

Title: Scale PyData To Clusters & GPUs with Dask

Presentation Overview:
It’s common for Python users to end up with more data than fits on their laptop. Sampling is great, but sometimes you need to process everything. In the past, Python users didn’t have much choice beyond Spark, but that isn’t the case anymore.

This talk is an introduction to Dask. You’ll learn how to scale PyData code from a laptop to a cluster and back. We’ll include some fun examples focused on finance and GPUs.
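
To set the stage, here is a minimal sketch of the kind of workflow the talk covers: the familiar pandas-style API, run lazily and in parallel by Dask. This is a generic illustration, not material from the talk; it uses Dask's built-in demo dataset so it runs anywhere, and you would swap in dd.read_csv(...) for real files.

    # A minimal Dask sketch: pandas-style operations, executed in parallel.
    # Uses dask.datasets.timeseries(), a built-in synthetic dataset, so the
    # example is self-contained; real workloads would read files instead.
    import dask.datasets
    from dask.distributed import Client

    if __name__ == "__main__":
        client = Client()  # local cluster; point at a remote scheduler to scale out

        df = dask.datasets.timeseries()           # a partitioned Dask DataFrame
        result = df.groupby("name")["x"].mean()   # lazy: builds a task graph, computes nothing yet

        print(result.compute())                   # .compute() executes the graph across the workers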

Bio:
Gus is a recovering consultant who fled to software development five years ago. He recently reunited with former Anaconda colleagues at Coiled, where they provide software and support for commercial and community users of Dask.

Presenter: Gordon Hart

Title: Test-Centric ML Model Development

Presentation Overview:
Current ML model evaluation techniques are falling short. Evaluation using only global metrics like accuracy or F1 score produces a low-resolution picture of model performance and says nothing about how the model behaves across different types of cases, attributes, and scenarios.
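
As a toy illustration (with made-up numbers, not from the talk), consider how a healthy-looking global accuracy can hide a badly failing scenario:

    # Hypothetical per-slice results: (correct predictions, total cases).
    results = {
        "daytime":   (940, 960),
        "nighttime": (22, 40),
    }

    correct = sum(c for c, _ in results.values())
    total = sum(t for _, t in results.values())
    print(f"global accuracy: {correct / total:.1%}")  # 96.2% -- looks healthy

    for name, (c, t) in results.items():
        print(f"{name} accuracy: {c / t:.1%}")  # nighttime is only 55.0%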

It is rapidly becoming vital for production ML teams to understand fully when and how their models fail, and to track these behaviors across model versions so they can catch regressions.

We’ve seen great results from teams applying unit and functional testing techniques to their models. In this talk, we’ll cover why systematic unit testing matters and how to test ML system behavior effectively.
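
For a flavor of what this looks like in practice, here is a minimal pytest sketch of a behavioral unit test. It is a generic illustration, not Kolena's API or material from the talk; SentimentModel is a hypothetical stand-in for a real trained model.

    # A minimal behavioral unit test for a model, run with pytest.
    import pytest

    class SentimentModel:
        """Hypothetical stand-in; in practice this wraps a trained classifier."""
        def predict(self, text: str) -> str:
            return "negative" if "not" in text.lower() else "positive"

    @pytest.fixture(scope="module")
    def model():
        return SentimentModel()

    # Behavioral expectation: negation should flip the predicted sentiment.
    @pytest.mark.parametrize("text,expected", [
        ("The service was great", "positive"),
        ("The service was not great", "negative"),
    ])
    def test_negation_flips_sentiment(model, text, expected):
        assert model.predict(text) == expected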

Bio:
After being burned one too many times by unexpected model behavior in mission-critical production systems, Gordon co-founded Kolena to build an ML testing and evaluation platform that tells you what you need to know before your model hits the real world.