About Course
Master the skills of Big Data, NoSQL and Data Science at once and become a successful Big Data Scientist with access to 13 courses at once for a lifetime! Start your journey now!
List of Courses present in this combo pack
Hadoop Architect Training: All in 1 Combo Course: Hadoop Developer, Hadoop Analyst, Hadoop Administration and Hadoop Testing
R Programming Training
Mahout Training
Data Science Training: Building Recommender Systems
Statistics and Probability Training
Apache Solr Training
Splunk Training
Apache Storm Training
Splunk Admin Training
HBase Training
Cassandra Training
MongoDB Training
Apache Spark, Scala Training
Key Features:
A comprehensive, in-depth combo of Big Data + Data Science + NoSQL courses including as many as ! niches, highly endorsed and top-paying technology courses
Intensive Learning on Hadoop – Hadoop Architect Training – All in 1 Combo Course which includes Hadoop Developer, Hadoop Analyst, Hadoop Administration and Hadoop Testing, R Programming Training, Mahout Training, Data Science Training: Building Recommender Systems, Statistics and Probability Training, Apache Solr Training, Splunk Training, Apache Storm Training, Splunk Admin Training, HBase Training, Cassandra Training, MongoDB Training and Apache Spark and Scala Training
./ hours of High-Quality in-depth Video E-Learning Sessions
404 hours of Lab Exercises
Intellipaat Proprietary VM for Lifetime and free cloud access for ! months for performing exercises.
70% of extensive learning through Hands-on exercises, Project Work, Assignments and Quizzes
The training will prepare you for multiple Professional Certification Exams:
Cloudera Certification: CCA Spark and Hadoop Developer, CCAH, R Certification, Mahout Certification, Cloudera Certification (CCP:DS), Apache Storm Certification, Cloudera Apache HBase Certification, Apache Cassandra Professional Certification, MongoDB Certification, Apache Spark Certification
24/7 Lifetime Support with Rapid Problem Resolution Guaranteed
Lifetime Access to Videos, Tutorials and Course Material
Guidance to Resume Preparation and Job Assistance
Step-by-Step Installation of multiple Software packages
Course Completion Certificate from Intellipaat
About Big Data, Data Science – Combo Course
Through this exceptionally elaborate course, learners can acquire the outstanding skills required of a Big Data and Data Science expert and gain in-depth knowledge of the Development, Administration and Analysis profiles and of integrating multiple systems together. Gaining expertise in as many as ! technologies at one time on a single order is the ultimate ticket to your dream job, a top-notch company and huge earnings. Intellipaat's All in One Big Data and Data Science Combo course endows you with the most endorsed technologies like Hadoop, Spark, Storm, Scala, NoSQL, Mahout, Splunk, Solr, Data Science, R Programming and core statistics and probability. This training course is a have-it-all package to produce skilled, competent and leading Big Data Scientists and Architects. Enrolling for this course will give individuals in-depth knowledge and the scope of being identified by the top multinationals worldwide.
Project Work:
Hadoop Projects
1. Project – Working with MapReduce, Hive, Sqoop
Problem Statement – It describes how to import MySQL data using Sqoop, query it using Hive, and run the word count MapReduce job.
2. Project – Work on MovieLens data for finding top records
Data – MovieLens Dataset
Problem Statement – It includes:
Write a MapReduce program to find the top 10 movies from the u.data file
Create the same top 10 movies using Pig by loading u.data into Pig
Create the same top 10 movies using Hive by loading u.data into Hive
3. Project – Hadoop YARN Project – End to End PoC
Problem Statement – It includes:
Import Movie data
Append the data
How to use Sqoop commands to bring the data into HDFS
End to End flow of transaction data
How to process real-world data or a huge amount of data using a MapReduce program, in terms of movies etc.
4. Project – Partitioning Tables
Problem Statement – It describes partitioning and how to perform it. It includes:
Manual Partitioning
Dynamic Partitioning
5. Project – Sales Commission
Data – Sales
Problem Statement – In this project we calculate the commission according to the sales.
6. Project – Connecting Pentaho with Hadoop Ecosystem
Problem Statement – It includes:
Quick Overview of ETL and BI
Configuring Pentaho to work with Hadoop Distribution
Loading data into Hadoop cluster
Transforming data into Hadoop cluster
Extracting data from Hadoop Cluster
7. Project – Multi-node Cluster Setup
Problem Statement – It includes the following actions:
Hadoop Multi Node Cluster Setup using Amazon EC2 – Creating 4 node cluster setup
Running MapReduce Jobs on Cluster
8. Project – Hadoop Testing using MR
Problem Statement – It describes how to test MapReduce code with MRUnit.
9. Project – Hadoop Weblog Analytics
Data – Weblogs
Problem Statement – The goal is to give the participants a feel for actual data sets in a production environment and to show how to load the data into a Hadoop cluster using various techniques. Once data is loaded, the next goal is to perform basic analytics on this data.
R Programming Project – Restaurant Revenue Prediction
Data – Revenue Data set
Problem Statement – It predicts the annual restaurant sales based on objective measurements. It uses the following data fields:
Opening Date
Type of the City
Type of the Restaurant
Three categories of Obfuscated Data
Revenue
It also includes:
Data Overview
Data Fields
Evaluation using RMSE
Feature Engineering Selection
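The RMSE evaluation named in this project can be sketched as follows. The course itself works in R; this is a plain Python illustration only, and the function name `rmse` and its error handling are assumptions, not part of the course material.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each prediction error, average them, then take the square root.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

A lower value means the predicted revenues sit closer to the actual ones; a perfect prediction gives 0.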
Data Science Projects:
Project 1 – Understanding Cold Start Problem in Data Science
Algorithms for Recommender
Ways of Recommendation
Types of Recommendation – Collaborative Filtering Based Recommendation, Content-Based Recommendation
Cold Start Problem
Project 2 – Recommendation for Movie, Summary
Recommendation for movie
Two Types of Predictions – Rating Prediction, Item Prediction
Important Approaches: Memory-Based and Model-Based
Knowing User-Based Methods in K-Nearest Neighbor
Understanding Item-Based Method
Matrix Factorization
Singular Value Decomposition
Data Science Project discussion
Collaborative Filtering
Business Variables Overview
SPT Project – Data Analysis Project
Data – Sales
Problem Statement – It includes the following actions:
Understand the business solutions
Discussion with the warehouse team
Data Collection & Storage
Data Cleaning
Build a Hypothesis Tree around the business problem
Produce the final result.
Apache Solr Project – Function Queries
Problem Statement – It describes how to use function queries in Solr. Suppose an index stores the dimensions in meters x, y, z of some hypothetical boxes with arbitrary names stored in the field boxname. Suppose we want to search for boxes matching the name findbox, ranked according to the volumes of the boxes.
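The ranking described in this problem statement can be illustrated outside Solr. The sketch below is plain Python, not a Solr request: it filters hypothetical box records by name and orders them by the volume x * y * z. In Solr itself the volume would typically be computed with a function query such as product(x,y,z); the exact request used in the course is not given here, and the helper name `rank_by_volume` is an assumption.

```python
def rank_by_volume(boxes, name):
    """Return the boxes whose 'boxname' equals name, largest volume first.

    Each box is assumed to be a dict with keys 'boxname', 'x', 'y' and 'z'
    (dimensions in meters), mirroring the fields named in the problem statement.
    """
    matches = [box for box in boxes if box["boxname"] == name]
    return sorted(matches, key=lambda box: box["x"] * box["y"] * box["z"], reverse=True)
```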
Splunk Project:
After finishing this training course, the Splunk project will have you create a report and dashboard from a text file containing employee details.
You will perform various row operations to fetch data as per your requirements and use important Splunk commands on the file to extract certain fields.
Other significant aspects of this project are editing the event, adding tags, searching events by tag name and saving the tag search.
Splunk Admin Project – Field Extraction
Problem Statement – It includes:
About Field Extraction
Field Extractor Utility
Field Extraction page in Splunk Web
Configure field extraction in configuration files etc.
Apache Storm Projects:
Real-time Project on Storm
The Project Bolt Blueprint
HBase Project – Integrate Hive and Java with HBase
Problem Statement – This project describes how to integrate Hive and Java with HBase. It includes the following actions:
Installation of HBase
Creation of Table
Java Program to create the table in HBase
Managing the HBase Table with Hive
Bulk Import etc.
MongoDB Project – Java MongoDB Integration
Problem Statement – It creates a table to insert the video file using the Java program. For this it performs the following actions:
Installation of Java
Adding MongoDB Java Connector etc.
Apache Spark Projects:
Mini Projects
Project 1. List the items
Project 2. Sorting of Records
Project 3. Show a histogram of date vs users created. Optionally, use a rich visualization
Project 4. Prepare a map of tags vs # of questions in each tag and display it.
Major Projects
Project 1: Movie Recommendation
Project 2: Twitter API Integration for tweet Analysis
Project 3: Data Exploration Using Spark SQL – Wikipedia dataset
Curriculum
Hadoop
Module 1 – Introduction to Big Data & Hadoop, Hadoop Ecosystem, MapReduce and HDFS
What is Big Data?
Factors constituting Big Data
Hadoop and its Ecosystem
MapReduce – Concepts of Map, Reduce, Ordering, Concurrency, Shuffle, Reducing
Hadoop Distributed File System (HDFS) Concepts and its Importance
Deep Dive in MapReduce – Execution Framework, Partitioner, Combiner, Data Types, Key pairs
HDFS Deep Dive – Architecture, Data Replication, Name Node, Data Node, Data Flow
Parallel Copying with DISTCP, Hadoop Archives
Assignment – 1
Module 2 – Hands-on Exercises
Installing Hadoop in Pseudo Distributed Mode, Understanding Important configuration files, their Properties and Daemon Threads
Accessing HDFS from Command Line
MapReduce – Basic Exercises
Understanding Big Data Hadoop Ecosystem
Introduction to Sqoop, use cases and Installation
Introduction to Hive, use cases and Installation
Introduction to Pig, use cases and Installation
Introduction to Oozie, use cases and Installation
Introduction to Flume, use cases and Installation
Introduction to YARN
Assignment – 2 and Mini Project – Importing MySQL Data using Sqoop and Querying it using Hive
Module 3 – Deep Dive in MapReduce
Mapper & Reducer
Relation between input splits and HDFS blocks.
MapReduce job submission flow of input splits.
How Mapper and Combiner work
Shuffle & Sort Phase, Combiner & Partitioner.
MapReduce in detail.
Comparison between YARN and MR1
MapReduce job Execution.
MapReduce Combiner.
MapReduce Partitioner.
Shuffle & Sort Phase.
Job Scheduler
MapReduce job submission flow
Job launch process (Job)
Job launch process (Task)
Job launch process (Task tracker)
Job launch process (Task runner)
Joining of Files/Datasets
Joining Data sets in MapReduce.
Distributed cache.
Reduce Joins
Input Format
Custom Input Format.
Inverted Indexing.
MapReduce – Inverted Indexing
Hadoop APIs
Explanation of MapReduce organization.
How the mapper processes data, with a detailed example; testing module.
How to develop a MapReduce Application.
Writing unit tests
Best practices for developing and writing.
Debugging MapReduce applications.
Module 3.1
Project 1 – Hands-on exercise – end to end PoC using YARN or Hadoop 2.
Real World Transactions handling of Bank
Moving data using Sqoop to HDFS
Incremental update of data to HDFS
Running MapReduce Program
Running Hive queries for data analytics
Project 2 – Hands-on exercise – end to end PoC using YARN or Hadoop 2.7
Running MapReduce Code for Movie Rating and finding their fans and average rating
Assignment – 4 and 5
Module 4 – Deep Dive in Pig
Introduction to Pig
What Is Pig?
Pig's Features
Pig Use Cases
Interacting with Pig
Basic Data Analysis with Pig
Pig Latin Syntax
Loading Data
Simple Data Types
Field Definitions
Data Output
Viewing the Schema
Filtering and Sorting Data
Commonly-Used Functions
Hands-On Exercise: Using Pig for ETL Processing
Processing Complex Data with Pig
Complex/Nested Data Types
Iterating Grouped Data
Hands-On Exercise: Analyzing Data with Pig
Multi-Dataset Operations with Pig
Techniques for Combining Data Sets
Joining Data Sets in Pig
Set Operations
Splitting Data Sets
Hands-On Exercise
Extending Pig
Macros and Imports
Using Other Languages to Process Data with Pig
Hands-On Exercise: Extending Pig with Streaming and UDFs
Pig Jobs
Case studies of Fortune 500 companies, namely Electronic Arts and Walmart, with real data sets.
Assignment – 6
Module 5 – Deep Dive in Hive
Introduction to Hive
What Is Hive?
Hive Schema and Data Storage
Comparing Hive to Traditional Databases
Hive vs. Pig
Hive Use Cases
Interacting with Hive
Relational Data Analysis with Hive
Hive Databases and Tables
Basic HiveQL Syntax
Data Types
Joining Data Sets
Common Built-in Functions
Hands-On Exercise: Running Hive Queries on the Shell, Scripts, and Hue
Hive Data Management
Hive Data Formats
Creating Databases and Hive-Managed Tables
Loading Data into Hive
Altering Databases and Tables
Self-Managed Tables
Simplifying Queries with Views
Storing Query Results
Controlling Access to Data
Hands-On Exercise: Data Management with Hive
Hive Optimization
Understanding Query Performance
Indexing Data
Extending Hive
User-Defined Functions
Hands-on Exercises – Playing with huge data and Querying extensively.
User-defined Functions, Optimizing Queries, Tips and Tricks for performance tuning
Assignment – 7
Module 6 – Impala
Introduction to Impala
What is Impala?
How Impala Differs from Hive and Pig
How Impala Differs from Relational Databases
Limitations and Future Directions
Using the Impala Shell
Choosing the Best (Hive, Pig, Impala)
Modeling and Managing Data with Impala and Hive
Data Storage Overview
Creating Databases and Tables
Loading Data into Tables
Impala Metadata Caching
Data Partitioning
Partitioning Overview
Partitioning in Impala and Hive
Module 7 – AVRO
Selecting a File Format
Hadoop Tool Support for File Formats
Avro Schemas
Using Avro with Hive and Sqoop
Avro Schema Evolution
Module 8 – Introduction to HBase architecture
What is HBase
Where does it fit
What is NoSQL
Assignment – 8
Apache Spark
Module 9 – Why Spark? Explain Spark and Hadoop Distributed File System
What is Spark
Comparison with Hadoop
Components of Spark
Module 10 – Spark Components, Common Spark Algorithms – Iterative Algorithms, Graph Analysis, Machine Learning
Apache Spark – Introduction, Consistency, Availability, Partition
Unified Stack Spark
Spark Components
Comparison with Hadoop – Scalding example, Mahout, Storm, graph
Module 11 – Running Spark on a Cluster, Writing Spark Applications using Python, Java, Scala
Explain Python example
Show installing Spark
Explain driver program
Explaining Spark context with example
Define weakly typed variable
Combine Scala and Java seamlessly.
Explain concurrency and distribution.
Explain what a trait is.
Explain higher-order function with example.
Define OFI scheduler.
Advantages of Spark
Example of Lambda using Spark
Explain MapReduce with example
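As one possible illustration of the MapReduce example topic, and of the word count job used in the Hadoop projects, the sketch below simulates the map, shuffle and reduce phases in plain Python. It is not the Hadoop or Spark API taught in the course, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical.

```python
from collections import defaultdict

def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle/sort: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reducer: sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}

def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return reduce_phase(shuffle(pairs))
```

On a real cluster the mappers and reducers run in parallel on different nodes; here they run one after another only to show the data flow.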
Module 12 – Hadoop Cluster Setup and Running MapReduce Jobs
Hadoop Multi Node Cluster Setup using Amazon EC2 – Creating 4 node cluster setup
Running MapReduce Jobs on Cluster
Module 13 – Major Project – Putting it all together and Connecting Dots
Putting it all together and Connecting Dots
Working with Large data sets, Steps involved in analyzing large data
Assignment – 9, 10
Module 14 – Advanced MapReduce
Delving Deeper Into The Hadoop API
More Advanced MapReduce Programming, Joining Data Sets in MapReduce
Graph Manipulation in Hadoop
Assignment – 11, 12
Module 15 – ETL Connectivity with Hadoop Ecosystem
How ETL tools work in Big Data Industry
Connecting to HDFS from ETL tool and moving data from Local system to HDFS
Moving Data from DBMS to HDFS
Working with Hive with ETL Tool
Creating MapReduce job in ETL tool
End to End ETL PoC showing Hadoop integration with ETL tool.
Module 16 – Hadoop Cluster Configuration
Hadoop configuration overview and important configuration file
Configuration parameters and values
HDFS parameters
MapReduce parameters
Hadoop environment setup
'Include' and 'Exclude' configuration files
Lab: MapReduce Performance Tuning
Module 17 – Hadoop Administration and Maintenance
Namenode/Datanode directory structures and files
File system image and Edit log
The Checkpoint Procedure
Namenode failure and recovery procedure
Safe Mode
Metadata and Data backup
Potential problems and solutions / what to look for
Adding and removing nodes
Lab: MapReduce File system Recovery
Module 18 – Hadoop Monitoring and Troubleshooting
Best practices of monitoring a Hadoop cluster
Using logs and stack traces for monitoring and troubleshooting
Using open-source tools to monitor Hadoop cluster
Module 19 – Job Scheduling
How to schedule Hadoop Jobs on the same cluster
Default Hadoop FIFO Scheduler
Fair Scheduler and its configuration
Module 20 – Hadoop Multi Node Cluster Setup and Running MapReduce Jobs on Amazon EC2
Hadoop Multi Node Cluster Setup using Amazon EC2 – Creating 4 node cluster setup
Running MapReduce Jobs on Cluster
Module 21 – ZOOKEEPER
ZOOKEEPER Introduction
ZOOKEEPER use cases
ZOOKEEPER Services
ZOOKEEPER data Model
Znodes and their types
Znodes operations
Znodes watches
Znodes reads and writes
Consistency Guarantees
Cluster management
Leader Election
Distributed Exclusive Lock
Important points
Module 22 – Advance Oozie
Why Oozie?
Installing Oozie
Running an example
Oozie – workflow engine
Example M/R action
Word count example
Workflow application
Workflow submission
Workflow state transitions
Oozie job processing
Oozie – HADOOP security
Why Oozie security?
Job submission to Hadoop
Multi tenancy and scalability
Time line of Oozie job
Layers of abstraction
Use Case 1: time triggers
Use Case 2: data and time triggers
Use Case 3: rolling window
Module 23 – Advance Flume
Apache Flume
Big data ecosystem
Physically distributed Data sources
Changing structure of Data
Closer look
Anatomy of Flume
Core concepts
Channel selector
Sink processor
Data ingest
Agent pipeline
Transactional data exchange
Routing and replicating
Why channels?
Use case – Log aggregation
Adding Flume agent
Handling a server farm
Data volume per agent
Example describing a single node Flume deployment
Module 24 – Advance HUE
HUE introduction
HUE ecosystem
What is HUE?
HUE real world view
Advantages of HUE
How to upload data in File Browser?
View the content
Integrating users
Integrating HDFS
Fundamentals of HUE FRONTEND
Module 25 – Advance Impala
IMPALA Overview: Goals
User view of Impala: Overview
User view of Impala: SQL
User view of Impala: Apache HBase
Impala architecture
Impala state store
Impala catalogue service
Query execution phases
Comparing Impala to Hive
Testing
Module 26 – Hadoop Stack Integration Testing
Why Hadoop testing is important
Unit testing
Integration testing
Performance testing
Nightly QA test
Benchmark and end to end tests
Functional testing
Release certification testing
Security testing
Scalability Testing
Commissioning and Decommissioning of Data Nodes Testing
Reliability testing
Release testing
Module 27 – Roles and Responsibilities of Hadoop Testing
Understanding the Requirement, preparation of the Testing Estimation, Test Cases, Test Data, Test bed creation, Test Execution, Defect Reporting, Defect Retest, Daily Status report delivery, Test completion.
ETL testing at every stage (HDFS, HIVE, HBASE) while loading the input (logs/files/records etc.) using Sqoop/Flume, which includes but is not limited to data verification and Reconciliation.
User Authorization and Authentication testing (Groups, Users, Privileges etc.)
Report defects to the development team or manager and drive them to closure.
Consolidate all the defects and create defect reports.
Validating new features and issues in Core Hadoop.
Module 28 – Framework called MRUnit for Testing of MapReduce Programs
Report defects to the development team or manager and drive them to closure.
Consolidate all the defects and create defect reports.
Validating new features and issues in Core Hadoop
Responsible for creating a testing Framework called MRUnit for testing of MapReduce programs.
Module 29 – Unit Testing
Automation testing using OOZIE.
Data validation using the QuerySurge tool.
Module 30 – Test Execution of Hadoop (customized)
Test plan for HDFS upgrade
Test automation and result
Module 31 – Test Plan Strategy and Test Cases of Hadoop Testing
How to test install and configure
Module 32 – High Availability Federation, YARN and Security
Module 33 – Job and Certification Support
Major Project on Big Data and Hadoop, Hadoop Development, Cloudera Certification Tips and Guidance and Mock Interview Preparation, Practical Development Tips and Techniques, certification preparation
Project Work
1. Project – Working with MapReduce, Hive, Sqoop
Problem Statement – It describes how to import MySQL data using Sqoop, query it using Hive, and run the word count MapReduce job.
2. Project – Work on MovieLens data for finding top records
Data – MovieLens dataset
Problem Statement – It includes:
Write a MapReduce program to find the top 10 movies from the u.data file
Create the same top 10 movies using Pig by loading u.data into Pig
Create the same top 10 movies using Hive by loading u.data into Hive
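The top 10 task can be prototyped on a single machine before writing the MapReduce, Pig or Hive versions. The sketch below is plain Python and rests on two assumptions that the project description does not spell out: that u.data is the tab-separated MovieLens ratings file (user id, movie id, rating, timestamp), and that "top" means the movies with the most ratings. The function name `top_movies` is hypothetical.

```python
from collections import Counter

def top_movies(lines, n=10):
    """Return the n most-rated movie ids as (movie_id, rating_count) pairs.

    Assumes each line is tab-separated: user_id, movie_id, rating, timestamp.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        counts[fields[1]] += 1
    return counts.most_common(n)
```

With the real file this could be called as `top_movies(open("u.data"))`. Ranking by average rating instead would only change what is accumulated per movie id.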
3. Project – Hadoop YARN Project – End to End PoC
Problem Statement – It includes:
Import Movie data
Append the data
How to use Sqoop commands to bring the data into HDFS
End to End flow of transaction data
How to process real-world data or a huge amount of data using a MapReduce program, in terms of movies etc.
4. Project – Partitioning Tables
Problem Statement – It describes partitioning and how to perform it. It includes:
Manual Partitioning
Dynamic Partitioning
5. Project – Sales Commission
Data – Sales
Problem Statement – In this project we calculate the commission according to the sales.
6. Project – Connecting Pentaho with Hadoop Ecosystem
Problem Statement – It includes:
Quick Overview of ETL and BI
Configuring Pentaho to work with Hadoop Distribution
Loading data into Hadoop cluster
Transforming data into Hadoop cluster
Extracting data from Hadoop Cluster
7. Project – Multi-node Cluster Setup
Problem Statement – It includes the following actions:
Hadoop Multi Node Cluster Setup using Amazon EC2 – Creating 4 node cluster setup
Running MapReduce Jobs on Cluster
8. Project – Hadoop Testing using MR
Problem Statement – It describes how to test MapReduce code with MRUnit.
9. Project – Hadoop Weblog Analytics
Data – Weblogs
Problem Statement – The goal is to give the participants a feel for actual data sets in a production environment and to show how to load the data into a Hadoop cluster using various techniques. Once data is loaded, the next goal is to perform basic analytics on this data.
R Programming
Module 1 – How R Works
Data mining Using Statistical packages
A Few concepts Before Starting
Module 2 part 1 – What is R-Packages
Assigning Values To Variables
Vector Creation
Module 2 part 2 – What is Sorting
Generating Repeats
What is rep Function
Generating Factor Levels
Sorting Process
Module 2 part 3 – Transpose Function
Stack Function Used
Module 3 part 1 – Functions & Reading Data from External Files
Merge Function
Strsplit Function
Matrix Manipulation
Row Sums
Module 3 part 2 – Generating Plots and Pie Charts
Line Plots
Bar Plots
Bar Plots For Population
Pie Chart Components
Module 4 part 1 – Analysis of Variance (ANOVA)
One Way Analysis of Variance
Two Way Analysis of Variance
Module 4 part 2 – What is Cluster Analysis
K-Means Clustering
Cluster Algorithm Working
Module 5 part 1 – Association Rule Mining / Affinity Analysis
Association Rule Mining / Affinity Analysis
Module 5 part 2 – Two Variable Relationships
Linear Regression
Dependent And Independent Variables
Scatter Plots
Module 6 Part 1 – Database connectivity & Logistic Regression
Logistic Regression
Examples of Logistic Regression
Logistic Regression in R
Module 6 Part 2 – ROC Curve in R
Confusion Matrix
ROC Curve in R
Sensitivity & Specificity
Data Base Connectivity RODBC
Reading Data to ODBC Tables
Function (Mean)
Examples Of Function
Module 7 – Integrating R with Hadoop
Methods to integrate two popular open source software packages for Big Data analytics: R and Hadoop
Integrating R with Hadoop using RHadoop and RMR package
Exploring RHIPE (R Hadoop Integrated Programming Environment)
Writing MapReduce Jobs in R and executing them on Hadoop
Project – Restaurant Revenue Prediction
Data – Revenue Data set
Problem Statement – It predicts the annual restaurant sales based on objective measurements. It uses the following data fields:
Opening Date
Type of the City
Type of the Restaurant
Three categories of Obfuscated Data
Revenue
It also includes:
Data Overview
Data Fields
Evaluation using RMSE
Feature Engineering Selection
Mahout
Module 1 – Mahout Overview
Classification and Recommendation
Clustering in Mahout
Pattern Mining
Understanding machine Learning
Using Model diagram to decide the approach
Data flow
Supervised and Unsupervised learning
Module 2 – Mahout Recommendations
Concept of Recommendation
Recommendations by E-commerce site
Comparison between User Recommendations and Item recommendation
Define recommenders and Classifiers
Process of Collaborative Filtering
Explaining Pearson coefficient algorithm
Euclidean distance measure
Implementing a recommender using MapReduce
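The Pearson coefficient listed in this module is commonly used to measure how similarly two users rate the same items. The following is a minimal plain Python sketch of the standard formula, not Mahout's own implementation; the function name `pearson` and the choice to return 0.0 when one user's ratings are constant are assumptions made for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length rating lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Co-deviation of the two lists and the squared deviations of each list.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined for constant ratings; 0.0 is a convention chosen here
    return cov / math.sqrt(var_x * var_y)
```

A value near 1 means the two users rank items alike, and a value near -1 means they rank them in opposite order.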
Module 3 – Clustering Session 1
Defining Clustering
User-to-user similarity
Clustering Illustration
Euclidean distance measure
Distance measure vector
Understanding the process of Clustering
Vectorizing documents – Unstructured data
Module 3 – Clustering Session 2
Document clustering
Sequence-to-sparse Utility
K-Means Clustering
Module 3 – Clustering Session 3
Module 4 – Classification Session 1
Predictor and Target variable
Classifiable Data
Key Challenges in Classification algorithm
Vectorizing Continuous data
Classification Examples
Logistic Regression and its examples
Module 4 – Clustering and Classification Session 2
Clustering Process
Transaction Clustering
Different techniques of Vectorization
Distance measure
Clustering algorithm – K-Means
Clustering Application-1
Clustering Application-2
Sentiment Analyzer
Module 5 – Pattern Mining
Pearson Coefficient
Collaborative Filtering Process
Collaborative Filtering
Similarity Algorithms
Pearson Correlation
Euclidean Distance Measure
Frequent Pattern & Association rules
Frequent Pattern Growth
Session – Course Summary
Data Science
Module 1 – Getting started with Data Science and Recommender Systems
Data Science Overview
Reasons to use Data Science
Project Lifecycle
Data Acquisition
Evaluation of Input Data
Transforming Data
Statistical and analytical methods to work with data
Machine Learning basics
Introduction to Recommender systems
Apache Mahout Overview
Module 2 – Reasons to Use, Project Lifecycle
What is Data Science?
What Kind of Problems can you solve?
Data Science Project Life Cycle
Data Science – Basic Principles
Data Acquisition
Data Collection
Understanding Data – Attributes in a Data, Different types of Variables
Build the Variable type Hierarchy
Two Dimensional Problem
Correlation between the Variables – explain using Paint Tool
Outliers, Outlier Treatment
Boxplot, How to Draw a Boxplot
Module 3 – Acquiring Data
Discussion on Boxplot – also Explain
Example to understand variable Distributions
What is Percentile? – Example using RStudio tool
How do we identify outliers?
How do we handle outliers?
Outlier Treatment: Using Capping/Flooring General Method
Distribution – What is Normal Distribution?
Why is Normal Distribution so popular?
Uniform Distribution
Skewed Distribution
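The percentile and capping/flooring topics in this module can be sketched together. The course demonstrates them in RStudio; the plain Python version below is only an illustration. The 5th and 95th percentile thresholds, the linear-interpolation percentile rule and the function names `percentile` and `cap_and_floor` are all assumptions, since the module does not state which cut-offs it uses.

```python
def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between sorted values."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must be non-empty")
    position = (len(ordered) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction

def cap_and_floor(values, low_q=5, high_q=95):
    """Raise values below the low_q percentile and lower values above the high_q percentile."""
    floor, cap = percentile(values, low_q), percentile(values, high_q)
    return [min(max(v, floor), cap) for v in values]
```

Unlike dropping outliers, this keeps every observation and only pulls extreme values back to the chosen limits.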
Module 4 – Machine Learning in Data Science
Discussion about Boxplot and Outlier
Goal: Increase Profits of a Store
Areas of increasing the efficiency
Data Request
Business Problem: To maximize shop Profits
What are Interlinked variables
What is Strategy
Interaction between the Variables
Univariate analysis
Multivariate analysis
Bivariate analysis
Relation between Variables
Standardize Variables
What is Hypothesis?
Interpret the Correlation
Negative Correlation
Machine Learning
Module 5 – Statistical and analytical methods dealing with data, Implementation of Recommenders using Apache Mahout and Transforming Data
Correlation between Nominal Variables
Contingency Table
What is Expected Value?
What is Mean?
How Expected Value differs from Mean
Experiment – Controlled Experiment, Uncontrolled Experiment
Degree of Freedom
Dependency between Nominal Variable & Continuous Variable
Linear Regression
Extrapolation and Interpolation
Univariate Analysis for Linear Regression
Building Model for Linear Regression
What does Pattern of Data mean?
Data Processing Operation
What is sampling?
Sampling Distribution
Stratified Sampling Technique
Disproportionate Sampling Technique
Balanced Allocation – part of Disproportionate Sampling
Systematic Sampling
Cluster Sampling
2 angles of Data Science – Statistical Learning, Machine Learning
Module ( 4esting and Assessment, =roduction De-lo"ment and More •
Multi *ariable anal"sis
linear regration
Sim-le linear regration
2"-othesis testing
S-eculation *s! claim;Quer"<
Ste- to test "our h"-othesis
-erformance measure
enerate null h"-othesis
alternati*e h"-othesis
4esting the h"-othesis
4hreshold *alue
2"-othesis testing e.-lanation b" e.am-le
Null 2"-othesis
Alternati*e 2"-othesis
2istogram of mean *alue
)e*isit C2+8SQ?A) inde-endence test
Correlation between Nominal ariable
Module 7 - Business Algorithms, Simple approaches to Prediction, Building model, Model deployment
Machine Learning
Importance of Algorithms
Supervised and Unsupervised Learning
Various Algorithms on Business
Simple approaches to Prediction
Predict Algorithms
Population data
Disproportionate Sampling
Steps in Model Building
Sample the data
What is K?
Training Data
Test Data
Validation data
Model Building
Find the accuracy
Deploy the model
Linear regression
Module 8 - Getting started with Segmentation of Prediction and Analysis
Cluster and Clustering with Example
Data Points, Grouping Data Points
Manual Profiling
Horizontal & Vertical Slicing
Clustering Algorithm
Criteria to take into Consideration before doing Clustering
Graphical Example
Clustering & Classification: Exclusive Clustering, Overlapping Clustering, Hierarchy Clustering
Simple Approaches to Prediction
Different types of Distances: 1. Manhattan, 2. Euclidean, 3. Cosine Similarity
Clustering Algorithm in Mahout
Probabilistic Clustering
Pattern Learning
Nearest Neighbor Prediction
Nearest Neighbor Analysis
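The three distance measures named in this module can be illustrated with a minimal Python sketch. It is not part of the course material; the function names and the sample points are assumptions chosen only for illustration.

```python
import math


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical sample points, used only to exercise the functions.
p = [1.0, 2.0]
q = [4.0, 6.0]
print(manhattan(p, q), euclidean(p, q), cosine_similarity(p, q))
```

Manhattan and Euclidean are distances (smaller means closer), while cosine similarity is a similarity (larger means closer), so a clustering or nearest-neighbor routine has to treat them accordingly.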
Module 9 - Integration of R and Hadoop
R introduction
How R is typically used
Features of R
Introduction to Big data
Ways to connect with R and Hadoop
Case Study
Steps for Installing RIMPALA
How to create IMPALA packages
Projects
Project 1 - Understanding Cold Start Problem in Data Science
Algorithms for Recommender
Ways of Recommendation
Types of Recommendation - Collaborative Filtering Based Recommendation, Content-Based Recommendation
Cold Start Problem
Project 2 - Recommendation for Movie, Summary
Recommendation for movie
Two Types of Predictions - Rating Prediction, Item Prediction
Important Approaches: Memory Based and Model Based
Knowing User Based Methods in K-Nearest Neighbor
Understanding Item Based Method
Matrix Factorization
Decomposition of Singular Value
Data Science Project discussion
Collaborative Filtering
Business Variables Overview
Data Science Assignment
Real-time enterprise problem
Use of various datasets to solve this problem
Use of Variables for Problem Resolution
Building strategy to solve this problem with the available data
Descriptive Statistics
SPT Module 1 - Information of Statistics
What is statistics
How is this useful
What is this course for
Module 2 - Data Conversion
Converting data into useful information
Collecting the data
Understand the data
Finding useful information in the data
Interpreting the data
Visualizing the data
Module 3 - Terms of Statistics
Descriptive statistics
Let us understand some terms in statistics
Module 4 - Plots
Dot Plots
Box and whisker plots
Outlier detection from box plots and Box and whisker plots
Module 5 - Statistics & Probability
What is probability
Set & rules of probability
Bayes Theorem
Module 6 - Distributions
Probability Distributions
Few Examples
Student T-Distribution
Sampling Distribution
Student t-Distribution
Poisson distribution
Module 7 - Sampling
Stratified Sampling
Proportionate Sampling
Systematic Sampling
P-Value
Stratified Sampling
Module 8 - Tables & Analysis
Cross Tables
Bivariate Analysis
Multi variate Analysis
Dependence and Independence tests (Chi-Square)
Analysis of Variance
Correlation between Nominal variables
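As a rough illustration of the Chi-Square independence test listed above, the sketch below computes the test statistic and degrees of freedom for a cross table of two nominal variables. The function name and the sample counts are assumptions made for this example and do not come from the course.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for a cross table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical 2x2 cross table (rows: one nominal variable, columns: another).
cross_table = [[10, 20], [20, 10]]
stat, dof = chi_square_statistic(cross_table)
print(stat, dof)
```

The statistic is then compared with a threshold value from the chi-square distribution for the given degrees of freedom; a statistic above the threshold suggests the two variables are dependent.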
Project - Data Analysis Project
Data - Sales
Problem Statement - It includes the following actions:
Understand the business solutions
Discussion with the warehouse team
Data Collection & Storage
Data Cleaning
Build a Hypothesis Tree around the business problem
Produce the final result.
Apache Solr
Module 1. The Fundamentals
About Solr
Installing and running Solr
Adding content to Solr
Reading a Solr XML response
Changing parameters in the URL
Using the browse interface
Module 2. Searching
Sorting results
Query parsers
More queries
Hardwiring request parameters
Adding fields to default search
Result grouping
Module 3. Indexing
Adding your own content to Solr
Deleting data from Solr
Building a bookstore search
Adding book data
Exploring the book data
Dedupe update processor
Module 4. Updating your schema
Adding fields to the schema
Analyzing text
Module 5. Relevance
Field weighting
Phrase queries
Function queries
Fuzzier search
Sounds-like
Module 6. Extended features
Spell checking
Multilanguage
Module 7. Multicore
Adding more kinds of data
Module 8. SolrCloud
How SolrCloud works
Commit strategies
Managing Solr config files
Project - Function Queries
Problem Statement - It describes how to use function queries in Solr. Suppose an index stores the dimensions x, y, z (in meters) of some hypothetical boxes, with arbitrary names stored in the field boxname. Suppose we want to search for boxes matching the name findbox, ranked according to the volumes of the boxes.
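One possible way to express this search is to sort on Solr's product() function over the three dimension fields. The Python sketch below only builds the request parameters; the helper name, the volume alias, and the assumption that x, y and z are numeric fields usable in functions are illustrative and not taken from the course.

```python
from urllib.parse import urlencode


def volume_ranked_query(name, rows=10):
    """Build select-handler parameters ranking matching boxes by volume."""
    params = {
        # Match boxes by name in the boxname field.
        "q": f"boxname:{name}",
        # Return the stored fields plus the computed volume as a pseudo-field.
        "fl": "boxname,x,y,z,volume:product(x,y,z)",
        # Function query used for ranking: largest volume first.
        "sort": "product(x,y,z) desc",
        "rows": rows,
    }
    return urlencode(params)


print(volume_ranked_query("findbox"))
```

The resulting string would be appended to a select URL of a Solr core (for example a hypothetical core named boxes); no request is sent here, so the sketch runs without a Solr server.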
Splunk
Module 1 - Basic Concepts of Splunk Development
Splunk development concepts
Roles and responsibilities of Splunk Developer
Module 2 - Saving and Scheduling Searches
Exporting search results
Saving and sharing search results
Saving searches
Search scheduling
Module 3 - Creating Alerts
Describing alerts
Alert Creation
View fired alerts
Module 4 - Tags and Event Types
Understanding tags
Creating tags and using them in a search
Defining event types and their usefulness
Creating and using event types in a search
Module 5 - Search Commands
Reviewing search commands and performing general search practices
Examine the anatomy of a search
Using various commands to perform searches: fields, table, rename, rex & erex, multiply
Module 6 - Reporting Commands
Using the following commands and their functions: 1. top, 2. rare, 3. stats, 4. addcoltotals, 5. addtotals
Module 7 - Visualizations
Explore the available visualizations
Create Charts and timecharts
Omit null values and format results
Module 8 - Analyzing, Calculating and Formatting Results
Using eval command
Perform calculations
Value Conversion
Round values
Format values
Conditional statements
Filtering calculated results
Module 9 - Correlating Events
Overview of Transactions
Search Transactions
Module 10 - Enriching Data with Lookups
What are lookups?
Lookup file example
Creating a lookup table
Defining a lookup
Configuring an automatic lookup
Using the lookup in searches and reports
Module 11 - Creating Reports and Dashboards
Creating reports and charts
Creating dashboards and adding reports
Module 12 - Getting started with Parsing
Data Preview and Parsing Phase
Raw Data Manipulation
Extraction of Fields
Project
After finishing this training course, the Splunk project lets you create a report and dashboard from a text file containing employee details. You will perform various row operations to fetch data as per your requirements and use important Splunk commands on the file to extract certain fields. Other significant aspects of this project are editing the event, adding tags, searching events by tag name and saving the tag search.
Splunk Admin
Module 1 - Simple Splunk Environment
Installing Splunk
License Management
Data Inputs
App management
Module 2 - Basic Production Environment
Introduction to Splunk Configuration Files
Universal Forwarder
Forwarder Management
Module 3 - Various Data Inputs
Understanding Monitor Inputs
What are Network Inputs?
Define Modular and Scripted Inputs
Explaining Windows Inputs
What are Fine-tuning Inputs?
Module 4 - Index and User Management
Concept of Indexing in Splunk
Maintenance and Optimization of Indexes
Users: Their Roles and Authentication
Module 5 - Getting started with Parsing
Data Preview and Parsing Phase
Raw Data Manipulation
Extraction of Fields
Module 6 - Search Scaling and Monitoring
Performing Distributed Search
Search Performance Tuning
Understanding Execution issues in large scale deployment
Distributed Management Console
Project - Field Extraction
Problem Statement - It includes:
About Field Extraction
Field Extractor Utility
Field Extraction page in Splunk Web
Configure field extraction in configuration files etc.
Apache Storm
Module 1 - Understanding Architecture of Storm
Bayesian Law
Hadoop Distributed Computing
Big Data features
Legacy Architecture of Real Time System
Storm vs. Hadoop
Logical Dynamic and Components in Storm
Storm Topology
Execution Components in Storm
Stream Grouping
Bolt - normalization bolt
Module 2 - Installation of Apache Storm
Installing Apache Storm
Module 3 - Grouping
Different types of Grouping
Reliable and unreliable messaging
Fetching data - Direct connection and En-queued message
Bolt Lifecycle
Module 4 - Overview of Trident
Trident Spouts and its types
Components and Interface of Trident spout
Trident Function, Filter & Aggregator
Module 5 - Boot Stripping
Twitter Boot Stripping
Detailed learning on Boot Stripping
Concepts of Storm
Storm Development Environment
Projects
Real-time Project on Storm
The Project Bolt Blue Print
HBase
Module 1 - HBase Overview
Getting started with HBase
Core Concepts of HBase
Understanding HBase with an Example
Module 2 - Architecture of NoSQL
Why HBase?
Where to use HBase?
What is NoSQL?
Module 3 - HBase Data Modeling
HDFS vs. HBase
HBase Use Cases
Data Modeling HBase
Module 4 - HBase Cluster Components
HBase Architecture
Main components of HBase Cluster
Module 5 - HBase API and Advanced Operations
HBase Shell
HBase API
Primary Operations
Advanced Operations
Module 6 - Integration of Hive with HBase
Create a Table and Insert Data into it
Integration of Hive with HBase
Load Utility
Module 7 - File loading with both load Utility
Putting Folder to VM
File loading with both load Utility
Project - Integrate Hive and Java with HBase
Problem Statement - This project describes how to integrate Hive and Java with HBase. It includes the following actions:
Installation of HBase
Creation of Table
Java Program to create the table in HBase
Managing the HBase Table with Hive
Bulk Import etc.
Cassandra
Module 1 - Advantages and Usage of Cassandra
Brief Introduction of the course
Advantages and Usage of Cassandra
Module 2 - CAP Theorem and NoSQL Database
Why NoSQL Database
Replication in RDBMS
Key Challenges with RDBMS
NoSQL (Not only SQL)
NoSQL Category
Advantage & Limitation
Key Characteristics of NoSQL Database
CAP Theorem
Module 3 - Cassandra fundamentals, Data model, Installation and setup
What is Cassandra?
Non relational
Key deployment concept
What is column oriented database
Data Model - column
What is column family
Module 4 - Steps in Configuration
Token calculation
Configuration overview
Node tool
Expiring column
Module 5 - Summarization, node tool commands, cluster, Indexes, Cassandra & MapReduce, Installing Ops-center
Difference between Relational modeling & Cassandra modeling
Steps in Cassandra modeling
Time series modeling in Cassandra
Column family
Data modeling in Cassandra
Column family vs. Super column family
Counter column family
Partitioners strategies
Gossip protocols
Read operation
Module 6 - Multi Cluster setup
Node settings
Setup of Multinode cluster
Row cache and Key cache
Read operation
System keyspace
Commands overview
Column family
Module 7 - Thrift/AVRO/JSON/Hector Client
Hector client
How to write Java code
Hector tag
Module 8 - Datastax installation part, Secondary index
Node tool commands
Management of Cassandra
Secondary index
Cassandra & MapReduce
Datastax installation part
Module 9 - Cassandra API and Summarization and Thrift
Internals of connection pool
Client connectivity to Cassandra
Hector client key features
Hector client key concepts
Java code
MongoDB
Module 1 - Getting started with NoSQL, MongoDB and their Installation
Database type description
What is NoSQL Database?
NoSQL Database Types
Challenges with RDBMS
Why we require NoSQL data?
What is MongoDB
JSON/BSON Introduction
JSON Data Types
Example of JSON
Installation of MongoDB
Module 2 - Part 1 - NoSQL and its Importance
Database Type
Type of NoSQL Database
Challenges with RDBMS
Why NoSQL
ACID property
CAP Theorem
BASE property
Introduction to JSON/BSON
JSON Data types
Database collection & document
MongoDB use cases
Journaled, Fsynced
Replica Acknowledged
Module 2 - Part 2 - CRUD Operations
MongoDB CRUD Tutorial
Installation Rent
used ppt
JSON, its syntax
CRUD Introduction
Read and Write Operations
Write Operation Concern Levels
MongoDB CRUD Tutorials
MongoDB CRUD Reference
Hands on with CRUD Operations
Module 3 - Part 1 - Understanding Schema Design, Backup strategies, Data Modeling and Monitoring
Data Modeling in MongoDB
RDBMS vs. Data models
Data Modeling tools
Data modeling example & patterns
Model Tree structure
Operational strategies
Backup strategies
Monitoring Commands
Monitoring of performance issues
Run time configuration
Export & import of data
Relationship between Documents
Model Specific Application Contexts
Data Model Reference
Hands on with MongoDB Data Modeling
Module 3 - Part 2 - Data Administration and Management
Data Management
Introduction to replica
Election of new primary
Replica set
Type of Replica
Hidden Replica
Arbiter Replica
Concepts around Replication
Setting up replicated cluster
Setting up Sharded Cluster
Sharding Database, Collections
Hands on Exercise
Module 4 - Indexes and Aggregation
Introduction to Indexes
Concepts around Indexes
Type of Indexes
Index Property
Introduction to Aggregation
Type of Aggregation
Use cases of Aggregation
Hands on Exercise
Module 5 - Security in MongoDB
Security Risks to Databases
MongoDB Security Approach
MongoDB Security Concept
Access Control
Integration of MongoDB with Robomongo
Integration of MongoDB with Java
Module 6 - MongoDB Integration with Jaspersoft, Load and Manage Unstructured Data (Videos, Images, Logs, Resumes etc.)
Integration of MongoDB with Jaspersoft
Additional Concept (GridFS - mongo files)
Loading and Managing Unstructured Data (Videos, Images, Logs, Resumes etc.)
Project - Java MongoDB Integration
Problem Statement - It creates a table to insert the video file using the Java program. For this it performs the following actions:
Installation of Java
Adding MongoDB Java Connector etc.
Apache Spark
Module 1 - Why Spark? Explain Spark and Hadoop Distributed File System
What is Spark
Comparison with Hadoop
Components of Spark
Module 2 - Spark Components, Common Spark Algorithms - Iterative Algorithms, Graph Analysis, Machine Learning
Apache Spark - Introduction, Consistency, Availability, Partition
Unified Stack Spark
Spark Components
Comparison with Hadoop - Scalding example, Mahout, Storm, graph
Module 3 - Running Spark on a Cluster, Writing Spark Applications using Python, Java, Scala
Explain Python example
Show installing Spark
Explain driver program
Explaining Spark context with example
Define weakly typed variable
Combine Scala and Java seamlessly.
Explain concurrency and distribution.
Explain what is trait.
Explain higher order function with example.
Define OFI scheduler.
Advantages of Spark
Example of Lambda using Spark
Explain MapReduce with example
Module 4 - RDD and its operation
Difference between RISC and CISC
Define Apache Mesos
Cartesian product between two RDD
Define count
Define Filter
Define Fold
Define API Operations
Define Factors
Module 5 - Spark, Hadoop, and the Enterprise Data Centre, Common Spark Algorithms
How Hadoop cluster is different from Spark
Define writing data
Explain sequence file and its usefulness
Define protocol buffers
Define text file, CSV, Object Files and File System
Define sparse metrics
Explain RDD and Compression
Explain data stores and its usefulness
Module 6 - Spark Streaming
Define Elastic Search
Explain Streaming and its usefulness
Apache BookKeeper
Define DStream
Define MapReduce word count
Explain Parquet
Scala ORM
Define MLlib
Explain multi graphix and its usefulness
Define property graph
Module 7 - Spark Persistence in Spark
Scala and Python
Examples - K-means
Latent Dirichlet Allocation (LDA)
Module 8 - Broadcast and accumulator
Broadcast Variables
Example: Join
Alternative if one table is small
Better version with broadcast
How to create a Broadcast
Accumulators motivation
Example: Join
Accumulator Rules
Custom accumulators
Another common use
Creating an accumulator using Spark context object
Module 9 - Spark SQL and RDD
Spark SQL main capabilities
Spark SQL usage diagram
Spark SQL
Important topics in Spark SQL - Data frames
Twitter language analysis
Module 10 - Operations, Accumulators, Traits
How parallelism Takes place
The Master Parameter
Join Operations Example
Module 11 - Scheduling, Partitioning
Task Scheduling distribution
Scheduling Around Applications
Static Partitioning
Dynamic Sharing
Scheduling Within An Application
Fair Scheduling
High Availability Of Spark Master
Standby Masters With Zookeeper
Single Node Recovery With Local File System
High Order Functions
Module 12 - Capacity Planning in Spark
Practicals: Creating Maps, Transformations
Capacity planning in Spark
Concurrency in Java
Concurrency in Scala
Module 13 - Log Analysis
Array Buffers
Compact Buffer
Protocol Buffer
Log Analysis With Spark
First Log Analyzers In Spark
Mini Projects
Project 1. List the items
Project 2. Sorting of Records
Project 3. Show a histogram of date vs users created. Optionally, use a rich visualization like
Project 4. Prepare a map of tags vs # of questions in each tag and display it.
Major Projects
Project 1 Movie Recommendation
Project 2 Twitter API Integration for tweet Analysis
Project 3 Data Exploration Using Spark SQL - Wikipedia dataset
Scala
Module 1 - Introduction of Scala
Scala Overview
Module 2 - Pattern Matching
Advantages of Scala
REPL (Read Evaluate Print Loop)
Language Features
Type Inference
Higher order function
Pattern Matching
Application Space
Module 3 - Executing the Scala code
Uses of Scala interpreter
Example of static object timer in Scala
Testing of String equality in Scala
Implicit classes in Scala with examples.
Recursion in Scala
Currying in Scala with examples.
Classes in Scala
Module 4 - Classes concept in Scala
Constructor overloading
Abstract classes
Type hierarchy in Scala
Object equality
Val and var methods
Module 5 - Case classes and pattern matching
Sealed traits
Case classes
Constant pattern in case classes
Wild card pattern
Variable pattern
Constructor pattern
Tuple pattern
Module 6 - Concepts of traits with example
Java equivalents
Advantages of traits
Avoiding boilerplate code
Linearization of traits
Modelling a real world example
Module 7 - Scala Java Interoperability
How traits are implemented in Scala and Java
How extending multiple traits is handled
Module 8 - Scala collections
Classification of Scala collections
Iterator and iterable
List sequence example in Scala
Module 9 - Mutable collections vs. Immutable collections
Array in Scala
List in Scala
Difference between list and list buffer
Array buffer
Queue in Scala
Dequeue in Scala
Mutable queue in Scala
Stacks in Scala
Sets and maps in Scala
Module 10 - Use Case bobsrockets package
Different import types
Selective imports
Scala test case - ScalaTest FunSuite
JUnit test in Scala