Big Data is not just about volumes of data; it is the ability to both collect information and extract meaning from it. The businesses prospering in today's economy are those able to take advantage of Big Data. Over the next ten years, the ability to use Big Data will make or break a business. According to Forbes contributing writer Daniel Newman, "it … [Read more...] about Big Data, not Oil, is King in Today’s Economy
Apache
Apache Spark – A Basic Understanding
Before diving deep into how Apache Spark works, let's understand the jargon of Apache Spark. Job: a piece of code that reads some input from HDFS or the local file system, performs some computation on the data, and writes some output data. Stages: jobs are divided into stages. Stages are classified as map or reduce stages (it's easier to understand if you have worked on Hadoop and want to …
Apache Flink: The Next Distributed Data Processing Revolution?
Disclaimer: The results are valid only when network-attached storage is used in the computing cluster. The amount of data has grown significantly over the past few years. It is not feasible for a single machine to process large amounts of data. Therefore, the need for distributed data processing frameworks is growing. It all started back in 2011 when the first …
What Effect Will Deep Learning Have on Business?
One thing that could have a deep impact on business is deep learning. Deep learning can be thought of as a subfield of machine learning. Specifically, this form of machine learning was influenced by the study of the human brain. The algorithms involved are designed to mimic how the human brain operates, allowing a machine to learn in the same way. This is done through a system …
How to Overcome Big Data Analytics Limitations With Hadoop
Hadoop is an open-source project that was developed by Apache back in 2011. The initial version had a variety of bugs, so a more stable version was introduced in August. Hadoop is a great tool for big data analytics because it is highly scalable, flexible, and cost-effective. However, there are also some challenges that big data analytics professionals need to be aware of. The good …