Selecting the right tool for the job takes experience and knowledge, and knowing which tool fits can save a lot of time, energy and cost. This post therefore aims to provide some insight into this area, which should allow for the selection of a tool that is appropriate for…
Continue reading “Pandas vs Apache Spark vs Power BI Desktop: big data performance on a single machine”
Tag Archives: ETL
Data Acquisition
Any machine learning pipeline starts with the Extraction, Transformation and Loading (ETL) of data into the system, which is by far my favourite part of data science.
ETL Basics
As the name suggests, ETL comprises the following parts:
Data Extraction: This part deals with the acquisition of…
Continue reading “Data Acquisition”