Join Us Tonight! Bodo DataFrames: A Fast and Scalable HPC-Based Drop-In Replacement for Pandas
Join PyData Pittsburgh tonight at the Swartz Center for our November event, "Bodo DataFrames: A Fast and Scalable HPC-Based Drop-In Replacement for Pandas." Ehsan Totoni, CTO and Co-Founder of Bodo.ai, will discuss how Bodo DataFrames brings high-performance computing (HPC) techniques such as MPI and JIT compilation to the familiar Pandas API, allowing data scientists to scale Python workloads from millions to billions of rows without rewriting their code.
About the talk:
Pandas is a popular library among data scientists, but it struggles with large datasets: programs either become too slow or run out of memory. In this talk, we introduce Bodo DataFrames (https://github.com/bodo-ai/Bodo), a drop-in replacement for the Pandas library that uses high-performance computing (HPC) techniques such as the Message Passing Interface (MPI) and JIT compilation for acceleration and scaling. We give an overview of its architecture, explain how it avoids the problems of Pandas while keeping user code the same, go over concrete examples, and finally discuss current limitations. This talk is for Pandas users who would like to run their code on larger data without frustrating rewrites to other APIs. Basic knowledge of Pandas and Python is recommended.
Time:
5:30pm – Doors Open
6:00pm – Talk, Bodo DataFrames: A Fast and Scalable HPC-Based Drop-In Replacement for Pandas




