This week's inaugural #GWOSCON was a fantastic conference about open source software. The slides for my presentation "Reproducible Data Science with Open Source Tools" are available on GitHub in my talks repository: github.com/jayqi/talks
Great post! I appreciate the thorough exploration and discussion of different approaches.
Follow-up question: pandas (as of version 2.0) can use PyArrow as a backend instead of NumPy. Would using PyArrow's list data type address some of the shortcomings you identified for pandas?