Random Forests are easy to learn and use for both professionals and lay people: they require little research and programming, and they can be used by people without a strong statistical background. Simply put, they let you make more accurate predictions while avoiding most of the basic mistakes common to other methods.
In the spirit of this International Year of Statistics, the McDonald’s analytical approach speaks to some of the fundamentals of the field. It is a great example of how statistics can dramatically affect a mega-corporation.
The YARN Resource Manager service is the central controlling authority for resource management and makes allocation decisions. It exposes a Scheduler API that is specifically designed to negotiate resources rather than to schedule tasks. Applications can request resources at different layers of the cluster topology, such as nodes and racks. The scheduler determines how much to allocate, and where, based on resource availability and the configured sharing policy.
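To make the idea of requesting resources at different topology layers concrete, here is a minimal sketch in Python. It is not the YARN API: `Node`, `Request`, and `ToyScheduler` are hypothetical names invented for illustration, memory is the only resource, and the "configured sharing policy" is left out entirely. The sketch assumes a simple rule: try the requested node first, then any node on the requested rack, then any node in the cluster.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    rack: str
    free_mb: int


@dataclass
class Request:
    memory_mb: int
    node: Optional[str] = None  # preferred node, if any
    rack: Optional[str] = None  # preferred rack, if any


class ToyScheduler:
    """Hypothetical scheduler: hands out memory, does not run tasks."""

    def __init__(self, nodes: List[Node]) -> None:
        self.nodes = {n.name: n for n in nodes}

    def allocate(self, req: Request) -> Optional[str]:
        # Build the candidate list from most local to least local.
        candidates = []
        if req.node in self.nodes:
            candidates.append(self.nodes[req.node])
        if req.rack is not None:
            candidates.extend(
                n for n in self.nodes.values() if n.rack == req.rack
            )
        candidates.extend(self.nodes.values())

        # Grant the request on the first candidate with enough free memory.
        for node in candidates:
            if node.free_mb >= req.memory_mb:
                node.free_mb -= req.memory_mb
                return node.name
        return None  # nothing available right now


scheduler = ToyScheduler([
    Node("n1", "r1", 1024),
    Node("n2", "r1", 4096),
    Node("n3", "r2", 8192),
])
first = scheduler.allocate(Request(2048, node="n1", rack="r1"))
second = scheduler.allocate(Request(4096, node="n2", rack="r1"))
third = scheduler.allocate(Request(16384))
```

In this toy run the first request falls back from the preferred node to another node on the same rack, the second falls back to a different rack, and the third cannot be satisfied. The scheduler only decides where resources are granted; what the application runs there is outside its scope, which mirrors the distinction drawn above.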
At the moment, the data scientist represents a stage in the evolution of Big Data analysis: a stopgap until the technology emerges that will do the job for him. The progress of what algorithms can do is far outpacing the mortal speed of data scientists, and businesses are taking note of how much time, energy, and profit can be saved by alleviating the pressure on the data middleman.
GraphChi is designed specifically to run on a single computer with limited memory (DRAM). Since its release a few months ago, it has been used to analyze graphs with billions of edges. Running on a single machine means deployment and debugging are simpler. In addition, it is no longer necessary to find (optimal) graph partitions that minimize communication between compute nodes, which is the starting point for many distributed graph computations.
See on strata.oreilly.com
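As a rough illustration of single-machine graph processing with limited memory, the Python sketch below keeps the edge list in a file on disk and streams it repeatedly, holding only one label per vertex in memory. This is not GraphChi's actual algorithm or API; the function names, the edge-file format, and the choice of connected components by label propagation are all assumptions made for the example.

```python
import os
import tempfile


def write_edge_file(edges, path):
    """Write one 'src dst' pair per line (hypothetical on-disk format)."""
    with open(path, "w") as f:
        for src, dst in edges:
            f.write(f"{src} {dst}\n")


def stream_edges(path):
    """Yield edges one at a time so the full edge list never sits in memory."""
    with open(path) as f:
        for line in f:
            src, dst = line.split()
            yield int(src), int(dst)


def connected_components(path, num_vertices):
    """Label each vertex with the smallest vertex id in its component."""
    labels = list(range(num_vertices))  # the only per-vertex state in memory
    changed = True
    while changed:
        changed = False
        # One sequential pass over the edge file per iteration.
        for src, dst in stream_edges(path):
            low = min(labels[src], labels[dst])
            if labels[src] != low:
                labels[src] = low
                changed = True
            if labels[dst] != low:
                labels[dst] = low
                changed = True
    return labels


def demo_components():
    edges = [(0, 1), (1, 2), (3, 4)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "edges.txt")
        write_edge_file(edges, path)
        return connected_components(path, num_vertices=5)


component_labels = demo_components()
```

Because everything runs in one process, there is no partitioning step and no communication between compute nodes to minimize; the cost is repeated sequential passes over the edge file.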
Neo Technology Director of Media and Community Andreas Kollegger says a graph database is sort of like a document database that includes some specific types of structure — namely, information about how each entry is related to other entries. The most obvious type of graph that could be described here is a social graph.
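A small Python sketch can show what "a document database plus relationship information" might look like for a social graph. It is not Neo4j's data model or API; `TinyGraph`, the `FRIEND` relationship type, and the sample people are hypothetical and exist only to illustrate the idea.

```python
from collections import defaultdict


class TinyGraph:
    """Hypothetical in-memory graph: document-like nodes plus typed relationships."""

    def __init__(self):
        self.nodes = {}                 # node id -> dict of properties
        self.edges = defaultdict(list)  # node id -> list of (rel_type, target id)

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def relate(self, src, rel_type, dst):
        self.edges[src].append((rel_type, dst))

    def neighbors(self, node_id, rel_type):
        return [dst for rel, dst in self.edges.get(node_id, []) if rel == rel_type]

    def friends_of_friends(self, node_id):
        """People two FRIEND hops away who are not already direct friends."""
        direct = set(self.neighbors(node_id, "FRIEND"))
        result = set()
        for friend in direct:
            for fof in self.neighbors(friend, "FRIEND"):
                if fof != node_id and fof not in direct:
                    result.add(fof)
        return sorted(result)


g = TinyGraph()
for person, city in [("alice", "Oslo"), ("bob", "Malmo"),
                     ("carol", "Berlin"), ("dave", "Oslo")]:
    g.add_node(person, city=city)
g.relate("alice", "FRIEND", "bob")
g.relate("alice", "FRIEND", "dave")
g.relate("bob", "FRIEND", "carol")
g.relate("dave", "FRIEND", "bob")

suggestions = g.friends_of_friends("alice")
```

Each node carries its own properties, much like a document, while the relationships are stored explicitly, so a question such as "who are Alice's friends of friends?" becomes a short traversal rather than a series of lookups stitched together by hand.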
Despite the widespread interest in other types of NoSQL databases, such as the BigTable clone HBase and the document database MongoDB, Eifrem says that he never sees Neo4j competing with other NoSQL databases in bids. “It’s always Oracle RAC or a homegrown solution,” he says.