
10 Reasons You Don’t Need Hadoop For Your Data Analysis - Alternatives Must Try Before Using It


1. Size Of Overall Data - Hadoop is designed to work efficiently on large data sets.
2. Speed Of Data Growth (Growth Velocity) - How fast is this data growing? Is it growing at a very fast pace? What will its size be a few months or years from now?
3. Consider Archiving - Data archiving is the process of moving stale data to separate data storage for long-term retention (if required).
4. Consider Purging Data - At times we are busy collecting data and not really sure how much we should keep.
5. All Data Is Not Important - You may be tempted to keep all the data you have for your business.
6. Be Mindful Of What You Want To Collect As Data - In general, if you have some relational data there is a chance that you may get it from multiple sources and not all of it needs to be stored in your data warehouse.
7. Hire Analysts Who Understand The Business - Hadoop will be almost useless if your data analysts do not understand what to extract out of it. Invest in people who understand the business.
8. Use Statistical Sampling For Decision Making - This technique will not provide exact results; however, it may be used for getting a high-level understanding of a large data set.
9. Have You Really Hit The Edge Of Relational Database Processing? - Before you really explore other avenues, I would like you to see if a relational database is able to handle it.
10. Partition Data - Partitioning is supported by most of the popular open source relational databases.
11. Try A Database Sharding Approach For Relational Database - The last resort when you hit the edge of relational database processing speed.
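
To make the statistical sampling point concrete, here is a minimal Python sketch of estimating an average from a simple random sample instead of scanning every row. The order_values data, the estimate_mean helper, and the sample size are all made up for illustration; the 1.96 factor assumes a roughly normal sampling distribution, so treat the margin as approximate rather than exact.

```python
import math
import random
import statistics


def estimate_mean(values, sample_size, seed=0):
    """Estimate the mean of `values` from a simple random sample.

    Returns (estimate, margin), where margin is an approximate
    95% margin of error (1.96 standard errors).
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)  # sampling without replacement
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(sample_size)
    return mean, 1.96 * std_err


# Hypothetical data set: 200,000 synthetic order values.
data_rng = random.Random(42)
order_values = [data_rng.uniform(5, 500) for _ in range(200_000)]

# Look at 5% of the rows instead of all of them.
estimate, margin = estimate_mean(order_values, sample_size=10_000)
true_mean = statistics.fmean(order_values)

print(f"sample estimate: {estimate:.2f} +/- {margin:.2f}")
print(f"full-scan mean:  {true_mean:.2f}")
```

The estimate is not exact, which matches the caveat in item 8, but the margin of error shrinks with the square root of the sample size, so a modest sample is often enough for a high-level answer.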

(okay, that seemed to be eleven)

Stashed in: Big Data!


Hadoop does seem like overkill for most things. 
