Big Data Technologies
Under the hood: Data diving with Scuba #facebook #scuba #bigdata
Probabilistic programming does in 50 lines of code what used to take thousands
Apache Kafka is the circulatory system in use at LinkedIn.
Lambda Architecture explained
Is Apache Spark going to replace Hadoop?
Is it time for a new data warehouse?
Announcing Pulsar: Real-time Analytics at Scale
Introduction to OpenStack
Automating the Data Scientists
No subject appears to be more controversial to distributed systems engineers than the oft-misunderstood CAP theorem
Apache Spark is a platform for processing big data through streaming
NumPy is the fundamental package for scientific computing with Python
Big Data Analytics: Time For New Tools
How to use big data and Hadoop to drive telecom product development
Spark lights a fire under big data processing
Is there a schema behind a MongoDB database? Maybe...
MS Excel was always the BI tool of choice - is it back?
The Lambda Architecture has its merits, but alternatives are worth exploring.
Replication in the Kafka publish-subscribe messaging service
9 Lessons: Picking the Right NoSQL Tools
CausalImpact: A new open-source package for estimating causal effects in time series
Leveraging Big Data Techniques for Big Data Quality, #Hadoop, #MapReduce #YARN
INFATools speeds up the process of creating staging mappings, sessions and workflows in Informatica
Pandas is an open source, BSD-licensed library providing data structures and analysis tools for the Python programming language
Stinger - the next thing in Big Data architecture
Docker providing mini virtual machines (VMs) - Managing the excitement
Hadoop 101: An Explanation of the Hadoop Ecosystem
An apples-to-apples comparison of commercial Hadoop systems
Using the Data Restructuring Wizard for Unstructured Data
When to use Hadoop (and when not to)
Comparing Statistical Software - great guide
Google’s latest invention: a data-warehousing system named Mesa
A curated list of awesome big data frameworks, resources and other awesomeness....
War of the Hadoop SQL Engines
Hadoop: The Components You Need to Know
Hadoop is an immature technology and not originally designed with security in mind
Kubernetes - a solution for orchestrating and managing Docker containers at scale
I know this starts to get Big Brother-ish [Alpine is] doing that for your benefit, in a way.
Tamr for housebreaking your big data
Learn R : 12 Books and Online Resources
Data Mining Tools: Perl, Matlab, SAS, Pig, Impala, Shark, Clojure, Scalding, Elasticsearch, Spark MLlib, Graphlab, Shogun and Weka
Nati Shalom's Blog: Making Hadoop Run Faster
Stephen Fry Explains Cloud Computing
Can you have a truly scalable database without going NoSQL? Yes! - Google SQL F1
Apache Spark is another increasingly popular alternative to replace MapReduce
Why is Schema on Read so Useful?
How Disney built a big data platform on a startup budget
Big Step Full Metal Cloud brings supercomputing power, providing some 20 to 100 per cent more performance than any virtual Cloud
We looooooove FREE!!! Here are some of the best free data mining tools - dance to that tune
Gartner Magic Quadrant for Data Integration Tools
© PandaWhale, Inc. 2017. All rights reserved.