November Event – Bodo DataFrames: A Fast and Scalable HPC-Based Drop-In Replacement for Pandas
PyData Pittsburgh is excited to host our November event – Bodo DataFrames: A Fast and Scalable HPC-Based Drop-In Replacement for Pandas.
Join us on Tuesday, November 4th, as Ehsan Totoni, CTO and Co-Founder of Bodo.ai, discusses how Bodo DataFrames brings high-performance computing (HPC) techniques like MPI and JIT compilation to the familiar Pandas API, allowing data scientists to scale Python workloads from millions to billions of rows without rewriting their code.
For the latest information about the time and location, and to register, please see our meetup page:
About the talk:
Pandas is a popular library for data scientists, but it struggles with large datasets: programs either become too slow or run out of memory. In this talk, we introduce Bodo DataFrames (https://github.com/bodo-ai/Bodo) as a drop-in replacement for the Pandas library that uses high-performance computing (HPC) techniques such as the Message Passing Interface (MPI) and JIT compilation for acceleration and scaling. We give an overview of its architecture, explain how it avoids the problems of Pandas while keeping user code the same, go over concrete examples, and finally discuss current limitations. This talk is for Pandas users who would like to run their code on larger data without frustrating rewrites to other APIs. Basic knowledge of Pandas and Python is recommended.
About the Speaker:
Ehsan Totoni is an entrepreneur, computer science researcher, and software engineer working on democratization of High Performance Computing (HPC) for data engineering, data science and AI/ML. Ehsan received his PhD in computer science from the University of Illinois at Urbana-Champaign, working on various aspects of HPC and Parallel Computing. He then worked as a research scientist at Intel Labs and Carnegie Mellon University, focusing on programming systems to address the gap between programmer productivity and computing performance. Ehsan co-founded Bodo.ai in 2019 and is advancing Bodo’s mission of bringing HPC to all data applications.