The Dell | Cloudera Solution
Below is a post originally written by Barton George at his blog and discusses The Dell | Cloudera Solution :-
Data continues to grow at an exponential rate and no place is this more obvious than in the Web space. Not only is the amount exploding but so is the form data’s taking whether that’s transactional, documents, IT/OT, images, audio, text, video etc. Additionally much of this new data is unstructured/ semi-structured which traditional relational databases were not built to deal with.
Enter Hadoop, an Apache open source project which, when combined with Map Reduce allows the analysis of entire data sets, rather than sample sizes, of structured and unstructured data types. Hadoop lets you chomp thru mountains of data faster and get to insights that drive business advantage quicker. It can provide near “real-time” data analytics for click-stream data, location data, logs, rich data, marketing analytics, image processing, social media association, text processing etc. More specifically, Hadoop is particularly suited for applications such as:
- Search Quality — search attempts vs. structured data analysis; pattern recognition
- Recommendation engine — batch processing; filtering and prediction (ie use information to predict what similar users like)
- Ad-targeting – batch processing; linear scalability
- Thread analysis for spam fighting and detecting click fraud — batch processing of huge datasets; pattern recognition
- Data “sandbox” – “dump” all data in Hadoop; batch processing (ie analysis, filtering, aggregations etc.); pattern recognition
The Dell | Cloudera solution
The solution is comprised of Cloudera’s distribution of Hadoop, running on optimized Dell PowerEdge C2100 servers with Dell PowerConnect 6248 switch, delivered with joint service and support. Dell offers two flavors of this big data solution: Cloudera’s distribution with the free download of Hadoop software, and Cloudera’s enterprise version of Hadoop that comes with a charge.
It comes with its own “crowbar” and DIY option
The Dell | Cloudera solution for Apache Hadoop also comes with Crowbar, the recently open-sourced Dell-developed software, which provides the necessary tools and automation to manage the complete lifecycle of Hadoop environments. Crowbar manages the Hadoop deployment from the initial server boot to the configuration of the main Hadoop components allowing users to complete bare metal deployment of multi-node Hadoop environments in a matter of hours, as opposed to days. Once the initial deployment is complete, Crowbar can be used to maintain, expand, and architect a complete data analytics solution, including BIOS configuration, network discovery, status monitoring, performance data gathering, and alerting.
The solution also comes with a reference architecture and deployment guide, so you can assemble it yourself, or Dell can build and deploy the solution for you, including rack and stack, delivery and implementation. Share this post on The Dell | Cloudera Solution