Blog – (the-moose)-machine-learning

Pandas vs Apache Spark vs Power BI Desktop: big data performance on a single machine

Choosing the right tool for the job takes experience and knowledge, and knowing which tool fits can save a lot of time, energy and cost. This post aims to provide some insight into this area, which should help with selecting a tool that is appropriate for…

July 29, 2021 / July 30, 2021

ETL

Create your own Kafka cluster: Data streaming

At some point in your data science/engineering career, you will need to handle streaming data. The challenge with data streams is the velocity and volume of the incoming data, which leaves you no choice but to manage it in some way. Apache Kafka is a great tool…

February 5, 2021

Intermediate

Create your own Hadoop cluster: Living in a parallel dimension

At some point during your data science/data analytics journey, you will need to process huge amounts of data. At such times, it is useful to be able to harness the power of parallel processing. In simple terms, parallel processing means harnessing the processing power of…

February 1, 2021

About Me

I like tinkering with technology.
