At some point in your data science/engineering career, you will need to handle streaming data. The challenge with data streams is the velocity and volume of the incoming data, which leaves you no choice but to manage it in some way. Apache Kafka is a great tool… Continue reading “Create your own Kafka cluster: Data streaming”
Category Archives: Intermediate
Create your own Hadoop cluster: Living in a parallel dimension
At some point during your data science or data analytics journey, you will need to process huge amounts of data. At such times, it is useful to be able to harness the power of parallel processing. In simple words, parallel processing relates to the harnessing of the processing power of… Continue reading “Create your own Hadoop cluster: Living in a parallel dimension”