#protip If you want to make your #Python pandas code run faster ... rewrite it in #RStats using data.table
Replying to @ChristosArgyrop

Feb 13, 2022 · 8:56 PM UTC

Oh ... did not know they were around here. This is a comparison I did a few weeks ago; there is probably a difference in memory footprint too, but I have not gotten around to studying that in detail.
apply is 3-6x slower than agg in #Python. But if one is not doing simple columnar summaries, one must use apply. The equivalent apply code (the one using .SD) in #rstats using data.table (second row) is much faster (especially if one groups by several factors, e.g. 6x) #BigData
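
For anyone who wants to try the pandas side of this themselves, here is a rough sketch of the agg vs apply gap. It is not the original benchmark: the synthetic data, the column names (`g1`, `g2`, `x`, `y`), the sizes, and the `timed` helper are all made up for illustration, and the ratio you see will depend on your pandas version and group counts. The last step shows the kind of cross-column summary that a per-column agg cannot express, which is where apply (or `.SD` in data.table) becomes necessary.

```python
import time

import numpy as np
import pandas as pd

# Synthetic data; sizes and column names are assumptions for illustration.
rng = np.random.default_rng(0)
n_rows = 100_000
df = pd.DataFrame({
    "g1": rng.integers(0, 50, n_rows),
    "g2": rng.integers(0, 20, n_rows),
    "x": rng.normal(size=n_rows),
    "y": rng.normal(size=n_rows),
})


def timed(fn):
    """Hypothetical helper: run fn once, return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


grouped = df.groupby(["g1", "g2"])[["x", "y"]]

# Simple columnar summary: agg with a named reduction uses the built-in
# grouped implementation instead of calling Python code per group.
via_agg, t_agg = timed(lambda: grouped.agg("mean"))

# The same summary through apply: a Python function runs once per group.
via_apply, t_apply = timed(lambda: grouped.apply(lambda g: g.mean()))

# A summary that needs two columns at once, so a per-column agg cannot
# express it and apply is required.
xy_corr, t_corr = timed(lambda: grouped.apply(lambda g: g["x"].corr(g["y"])))

print(f"agg:          {t_agg:.4f} s")
print(f"apply (mean): {t_apply:.4f} s")
print(f"apply (corr): {t_corr:.4f} s")
```

Both mean computations return the same table, so any timing difference is pure overhead from the per-group Python call.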