Oryx, Spark, Impala and Gertrude - Machine Learning for Business
Mo Data stashed this in Machine Learning
Academic machine learning is all about optimization. Machine learning in a business setting is all about understanding: “My focus is always on how do I understand what the system is doing, come up with new hypotheses about this very complex system, test them, and then use what I’ve learned from those tests to find new ways to improve the system.”
An overview of Cloudera’s current data science tools: Oryx and Spark for building and serving machine learning models, Gertrude for multivariate testing, and Impala for ludicrously high-performance SQL queries against data stored in HDFS.