Pandas vs Apache Spark vs Power BI Desktop: big data performance on a single machine
Selecting the right tool for the job takes experience and knowledge. Knowing which tool fits can save a lot of time, energy and cost. This post therefore aims to provide some insight in this area, which should allow for the selection of a tool that is appropriate for…
Create your own Kafka cluster: Data streaming
At some point in your data science/engineering career, you will need to handle streaming data. The challenge with data streams is the velocity and volume of the incoming data, so you will have no choice but to manage it in some way. Apache Kafka is a great tool…
Create your own Hadoop cluster: Living in a parallel dimension
At some point during your data science/data analytics journey, you will need to process huge amounts of data. At such times, it is useful to be able to harness the power of parallel processing. In simple words, parallel processing relates to the harnessing of the processing power of…
