Description
Big Data Analysis in Python is having its renaissance. It all started with NumPy, which is also one of the building blocks behind the tool I am presenting in this article. In 2006, Big Data was a…
Summary
- Yet another Python library for data analysis that you should know about, and no, I am not talking about Spark or Dask.
- Each month I find a new tool that I am eager to learn.
- dv = vaex.from_csv(file_path, convert=True, chunk_size=5_000_000): this function automatically creates an HDF5 file and persists it to disk.
- dv.plot1d(dv.col2, figsize=(14, 7)) Virtual columns: Vaex creates a virtual column when you add a new column, that is, a column that takes up no main memory because it is computed on the fly.
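
To make the virtual column idea concrete, here is a minimal sketch in plain Python. It is not Vaex's implementation or API: the names TinyFrame, add_virtual and evaluate are hypothetical and exist only for this illustration. The point is that a virtual column stores a formula rather than data, and its values are produced only when something asks for them.

```python
from typing import Callable, Dict, Iterator, List


class TinyFrame:
    """Toy table (hypothetical, not Vaex): real columns hold data,
    virtual columns hold only a formula."""

    def __init__(self, **columns: List[float]) -> None:
        # Real columns: the values are stored in memory.
        self.columns: Dict[str, List[float]] = dict(columns)
        # Virtual columns: only the formula is stored, never the values.
        self.virtual: Dict[str, Callable[[Dict[str, float]], float]] = {}

    def add_virtual(self, name: str, formula: Callable[[Dict[str, float]], float]) -> None:
        """Register a column defined by a formula over the real columns."""
        self.virtual[name] = formula

    def evaluate(self, name: str) -> Iterator[float]:
        """Yield the values of a column one row at a time."""
        if name in self.columns:
            yield from self.columns[name]
            return
        formula = self.virtual[name]
        names = list(self.columns)
        # Compute each value on the fly; nothing is written back to the table.
        for values in zip(*self.columns.values()):
            yield formula(dict(zip(names, values)))


frame = TinyFrame(col1=[1.0, 2.0, 3.0], col2=[10.0, 20.0, 30.0])
frame.add_virtual("col3", lambda row: row["col1"] + row["col2"])

# The sum consumes the values as they are generated, so the full
# column never has to exist in memory at once.
total = sum(frame.evaluate("col3"))
print(total)
```

Because evaluate is a generator, an aggregation such as sum can stream through the virtual column row by row, which is the property that makes this style of column cheap on large datasets.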