About Course
Master the skills of Big Data, NoSQL and Data Science at once and become a successful Big Data Scientist, with lifetime access to 13 courses. Start your journey now!
List of courses in this combo pack:
Hadoop Architect Training: All-in-1 Combo Course (Hadoop Developer, Hadoop Analyst, Hadoop Administration and Hadoop Testing)
R Programming Training
Mahout Training
Data Science Training: Building Recommender Systems
Statistics and Probability Training
Apache Solr Training
Splunk Training
Apache Storm Training
Splunk Admin Training
HBase Training
Cassandra Training
MongoDB Training
Apache Spark, Scala Training
Key Features:
• A comprehensive, in-depth combo of Big Data + Data Science + NoSQL courses, including as many as ! niche, highly endorsed and top-paying technology courses
• Intensive Learning on Hadoop – Hadoop Architect Training – All-in-1 Combo Course, which includes Hadoop Developer, Hadoop Analyst, Hadoop Administration and Hadoop Testing, R Programming Training, Mahout Training, Data Science Training: Building Recommender Systems, Statistics and Probability Training, Apache Solr Training, Splunk Training, Apache Storm Training, Splunk Admin Training, HBase Training, Cassandra Training, MongoDB Training, and Apache Spark and Scala Training
• ./ hours of High-Quality, in-depth Video E-Learning Sessions
• 232 hours of Lab Exercises
• Intellipaat Proprietary VM for Lifetime and free cloud access for ! months for performing exercises
• /36 of extensive learning through Hands-on exercises, Project Work, Assignments and Quizzes
• The training will prepare you for multiple Professional Certification Exams:
Cloudera Certification: CCA Spark and Hadoop Developer, CCAH, R Certification, Mahout Certification, Cloudera Certification (CCP:DS), Apache Storm Certification, Cloudera Apache HBase Certification, Apache Cassandra Professional Certification, MongoDB Certification, Apache Spark Certification
• 24/7 Lifetime Support with Rapid Problem Resolution Guaranteed
• Lifetime Access to Videos, Tutorials and Course Material
• Guidance on Resume Preparation and Job Assistance
• Step-by-Step Installation of multiple Software packages
• Course Completion Certificate from Intellipaat
About Big Data, Data Science – Combo Course
Through this exceptionally elaborative course, learners can acquire the outstanding skills required of a Big Data / Data Scientist expert and gain in-depth knowledge of Development, Administration and Analysis profiles and of the integration of multiple systems. Gaining expertise in as many as ! technologies at one time on a single order is the ultimate ticket to your dream job, a top-notch company and huge earnings. Intellipaat's All-in-One Big Data and Data Science Combo course endows you with the most endorsed technologies, like Hadoop, Spark, Storm, Scala, NoSQL, Mahout, Splunk, Solr, Data Science, R Programming and core statistics and probability. This training course is a have-it-all package to produce skilled, competent and leading Big Data Scientists and Architects. Enrolling in this course will give individuals in-depth knowledge and the scope of being identified by the top multinationals worldwide.
Project Work:
Hadoop Projects
1. Project – Working with MapReduce, Hive, Sqoop
Problem Statement – It describes how to import MySQL data using Sqoop and query it using Hive, and also how to run the word count MapReduce job.
2. Project – Work on MovieLens data for finding top records
Data – MovieLens dataset
Problem Statement – It includes:
• Write a MapReduce program to find the top 10 movies from the u.data file
• Create the same top 10 movies using Pig by loading u.data into Pig
• Create the same top 10 movies using Hive by loading u.data into Hive
3. Project – Hadoop YARN Project – End-to-End PoC
Problem Statement – It includes:
• Import Movie data
• Append the data
• How to use Sqoop commands to bring the data into HDFS
• End-to-End flow of transaction data
• How to process real-world data, or a huge amount of data, using a MapReduce program in terms of the movie, etc.
4. Project – Partitioning Tables
Problem Statement – It describes partitioning and how to perform it. It includes:
• Manual Partitioning
• Dynamic Partitioning
• Bucketing
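The difference between partitioning and bucketing in the project above can be sketched in plain Python. This is only an illustration, not Hive itself: the field names `country` and `user_id` are hypothetical, and the modulo rule stands in for whatever hash function a real engine applies to the bucketing column.

```python
from collections import defaultdict

def partition_by(rows, column):
    """Partitioning: one group per distinct value of the column."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[column]].append(row)
    return dict(partitions)

def bucket_by(rows, column, num_buckets):
    """Bucketing: a fixed number of groups, chosen by value modulo N.

    Assumes the column holds integers (illustrative stand-in for a hash).
    """
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        buckets[row[column] % num_buckets].append(row)
    return buckets

# Hypothetical sample rows
rows = [
    {"user_id": 1, "country": "IN"},
    {"user_id": 2, "country": "US"},
    {"user_id": 3, "country": "IN"},
    {"user_id": 4, "country": "US"},
]
partitions = partition_by(rows, "country")
buckets = bucket_by(rows, "user_id", 2)
```

The number of partitions grows with the number of distinct values, while the number of buckets is fixed up front, which is why the two are usually applied to different kinds of columns.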
5. Project – Sales Commission
Data – Sales
Problem Statement – In this project we calculate the commission according to the sales.
6. Project – Connecting Pentaho with Hadoop Ecosystem
Problem Statement – It includes:
• Quick Overview of ETL and BI
• Configuring Pentaho to work with Hadoop Distribution
• Loading data into Hadoop cluster
• Transforming data into Hadoop cluster
• Extracting data from Hadoop Cluster
7. Project – Multi-node Cluster Setup
Problem Statement – It includes the following actions:
• Hadoop Multi-Node Cluster Setup using Amazon EC2 – Creating a 4-node cluster setup
• Running MapReduce Jobs on Cluster
8. Project – Hadoop Testing using MRUnit
Problem Statement – It describes how to test MapReduce code with MRUnit.
9. Project – Hadoop Weblog Analytics
Data – Weblogs
Problem Statement – The goal is to give participants a feel for actual data sets in a production environment and for how to load the data into a Hadoop cluster using various techniques. Once the data is loaded, the next goal is to perform basic analytics on it.
R Programming Project – Restaurant Revenue Prediction
Data – Revenue Data set
Problem Statement – It predicts the annual restaurant sales based on objective measurements. It uses the following data fields:
• Id
• Opening Date
• Type of the City
• Type of the Restaurant
• Three categories of Obfuscated Data
• Revenue
It also includes:
• Data Overview
• Data Fields
• Evaluation using RMSE
• Feature Engineering Selection
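The RMSE evaluation mentioned above is the square root of the mean squared difference between actual and predicted revenue. The course project itself is in R; the following is a minimal Python sketch of the same formula, with made-up numbers in the usage line.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length, non-empty sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical revenues: errors are 2 and 0, so RMSE = sqrt((4 + 0) / 2)
example = rmse([3.0, 5.0], [1.0, 5.0])
```

A lower RMSE means predictions sit closer to the true revenues; because errors are squared, a few large misses weigh more than many small ones.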
Data Science Projects:
Project 1 – Understanding Cold Start Problem in Data Science
• Algorithms for Recommender
• Ways of Recommendation
• Types of Recommendation – Collaborative Filtering Based Recommendation, Content-Based Recommendation
• Cold Start Problem
Project 2 – Recommendation for Movie, Summary
• Recommendation for movie
• Two Types of Predictions – Rating Prediction, Item Prediction
• Important Approaches: Memory-Based and Model-Based
• Knowing User-Based Methods in K-Nearest Neighbor
• Understanding Item-Based Method
• Matrix Factorization
• Singular Value Decomposition
• Data Science Project discussion
• Collaborative Filtering
• Business Variables Overview
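A user-based K-Nearest Neighbor rating prediction, as listed in the project above, can be sketched as follows. This is an illustrative toy, not the course's implementation: the user and movie names are invented, cosine similarity is one of several possible similarity measures, and ratings are assumed to be positive numbers.

```python
import math

def cosine(u, v):
    """Cosine similarity over the items both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def predict(ratings, user, item, k=2):
    """Similarity-weighted average over the k most similar users who rated item."""
    neighbours = [
        (cosine(ratings[user], ratings[other]), ratings[other][item])
        for other in ratings
        if other != user and item in ratings[other]
    ]
    neighbours = sorted(neighbours, reverse=True)[:k]
    total = sum(sim for sim, _ in neighbours)
    if total == 0:
        return None  # cold start: no usable neighbours for this user
    return sum(sim * rating for sim, rating in neighbours) / total

# Hypothetical ratings: user -> {movie: rating}
ratings = {
    "ann": {"m1": 5, "m2": 3},
    "bob": {"m1": 5, "m2": 3, "m3": 4},
    "cat": {"m1": 1, "m2": 5, "m3": 2},
    "dan": {},  # new user with no history
}
ann_m3 = predict(ratings, "ann", "m3")
dan_m3 = predict(ratings, "dan", "m3")
```

The `None` result for the user with no ratings is the cold start problem in miniature: without history there is nothing to compute similarity from.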
SP Project – Data Analysis Project
Data – Sales
Problem Statement – It includes the following actions:
• Understand the business solutions
• Discussion with the warehouse team
• Data Collection & Storage
• Data Cleaning
• Build a Hypothesis Tree around the business problem
• Produce the final result.
Apache Solr Project – Function Queries
Problem Statement – It describes how to use function queries in Solr. Suppose an index stores the dimensions in meters x, y, z of some hypothetical boxes with arbitrary names stored in the field boxname. Suppose we want to search for boxes matching the name findbox, but ranked according to the volumes of the boxes.
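The ranking described above amounts to filtering on `boxname` and ordering by the product x·y·z. The sketch below emulates that logic in plain Python on invented sample documents; it does not call Solr, where the same idea would be expressed with a function query over the three fields.

```python
def rank_by_volume(boxes, name):
    """Return boxes whose boxname matches, largest volume first."""
    matches = [box for box in boxes if box["boxname"] == name]
    return sorted(matches, key=lambda b: b["x"] * b["y"] * b["z"], reverse=True)

# Hypothetical index contents (dimensions in meters)
boxes = [
    {"id": 1, "boxname": "findbox", "x": 1.0, "y": 2.0, "z": 3.0},   # volume 6
    {"id": 2, "boxname": "findbox", "x": 2.0, "y": 2.0, "z": 2.0},   # volume 8
    {"id": 3, "boxname": "otherbox", "x": 9.0, "y": 9.0, "z": 9.0},  # filtered out
]
ranked_ids = [box["id"] for box in rank_by_volume(boxes, "findbox")]
```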
Splunk Project:
• After you finish this training course, the Splunk Project will let you create a report and dashboard from a text file containing employee details.
• You will perform various row operations to fetch data as per your requirements and use important Splunk commands on the file to extract certain fields.
• Other significant aspects of this project are editing the event, adding tags, searching events by tag name and saving tag searches.
Splunk Admin Project – Field Extraction
Problem Statement – It includes:
• About Field Extraction
• Field Extractor Utility
• Field Extraction page in Splunk Web
• Configure field extraction in configuration files, etc.
Apache Storm Projects:
• Real-time Project on Storm
• The Project Bolt Blueprint
HBase Project – Integrate Hive and Java with HBase
Problem Statement – This project describes how to integrate Hive and Java with HBase. It includes the following actions:
• Installation of HBase
• Creation of Table
• Java Program to create the table in HBase
• Managing the HBase Table with Hive
• Bulk Import, etc.
MongoDB Project – Java MongoDB Integration
Problem Statement – It creates a table to insert the video file using the Java program. For this it performs the following actions:
• Installation of Java
• Adding MongoDB Java Connector, etc.
Apache Spark Projects:
Mini Projects
Project 1. List the items
Project 2. Sorting of Records
Project 3. Show a histogram of date vs users created. Optionally, use a rich visualization.
Project 4. Prepare a map of tags vs # of questions in each tag and display it.
Major Projects
Project 1 – Movie Recommendation
Project 2 – Twitter API Integration for tweet Analysis
Project 3 – Data Exploration Using Spark SQL – Wikipedia dataset
Curriculum
Hadoop
Module 1 – Introduction to Big Data & Hadoop, Hadoop Ecosystem, MapReduce and HDFS
• What is Big Data?
• Factors constituting Big Data
• Hadoop and its Ecosystem
• MapReduce – Concepts of Map, Reduce, Ordering, Concurrency, Shuffle, Reducing
• Hadoop Distributed File System (HDFS) Concepts and its Importance
• Deep Dive in MapReduce – Execution Framework, Partitioner, Combiner, Data Types, Key pairs
• HDFS Deep Dive – Architecture, Data Replication, Name Node, Data Node, Data Flow
• Parallel Copying with DistCp, Hadoop Archives
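The map, shuffle and reduce concepts listed in this module can be illustrated with a word count that runs in a single Python process. It is a teaching sketch under that assumption only: a real Hadoop job distributes these phases across nodes, and the function names here are illustrative rather than part of any Hadoop API.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in the line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: collapse the grouped values for one key into a total."""
    return key, sum(values)

def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in sorted(grouped.items()))

counts = word_count(["big data", "Big Hadoop data"])
```

A combiner would apply the same summing logic to each mapper's local output before the shuffle, reducing the amount of data moved between nodes.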
Assignment – 1
Module 2 – Hands-on Exercises
• Installing Hadoop in Pseudo-Distributed Mode, Understanding Important configuration files, their Properties and Daemon Threads
• Accessing HDFS from Command Line
• MapReduce – Basic Exercises
• Understanding Big Data Hadoop Ecosystem
• Introduction to Sqoop, use cases and Installation
• Introduction to Hive, use cases and Installation
• Introduction to Pig, use cases and Installation
• Introduction to Oozie, use cases and Installation
• Introduction to Flume, use cases and Installation
• Introduction to YARN
Assignment – 2 and Mini Project – Importing MySQL Data using Sqoop and Querying it using Hive
Module 3 – Deep Dive in MapReduce
1. Mapper & Reducer
• Relation between input splits and HDFS blocks
• MapReduce job submission flow of input splits
2. How Mapper and Combiner Work
3. Shuffle & Sort Phase, Combiner & Partitioner
• MapReduce in detail
• Comparison b/w YARN and MR1
• MapReduce job Execution
• MapReduce Combiner
• MapReduce Partitioner
• Shuffle & Sort Phase
4. Job Scheduler
• MapReduce job submission flow
• Job launch process (Job)
• Job launch process (Task)
• Job launch process (Task tracker)
• Job launch process (Task runner)
5. Joining of Files/Datasets
• Joining Data sets in MapReduce
• Distributed cache
6. Reduce Joins
• Counters
• Reduce Join
7. Input Format
• Custom Input Format
8. Inverted Indexing
• MapReduce – Inverted Indexing
9. Hadoop APIs
10. Explanation of MapReduce organization
• How the mapper processes data, with a detailed example; testing module
• How to develop a MapReduce Application
• Writing unit tests; Best Practices for developing and writing
• Debugging MapReduce applications
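Inverted indexing, one of the topics in this module, maps each word to the documents that contain it. The sketch below shows the idea in plain Python with invented document ids; a MapReduce version would emit (word, doc_id) pairs in the mapper and merge the ids per word in the reducer.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lower-cased word to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}

# Hypothetical documents
docs = {
    "d1": "Hadoop stores data",
    "d2": "Spark processes data",
}
index = build_inverted_index(docs)
```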
Module 3.1
• Project 1 – Hands-on exercise – end-to-end PoC using YARN or Hadoop 2
1. Real-World Transactions handling of Bank
2. Moving data using Sqoop to HDFS
3. Incremental update of data to HDFS
4. Running MapReduce Program
5. Running Hive queries for data analytics
• Project 2 – Hands-on exercise – end-to-end PoC using YARN or Hadoop 2.7: Running MapReduce Code for Movie Rating and finding their fans and average rating
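The movie-rating exercise above boils down to grouping ratings by movie and aggregating. A minimal single-process sketch follows; the record layout (user, movie, rating) and the rule that a "fan" is a user who rated the movie 4 or higher are assumptions made for illustration, since the course text does not define them.

```python
from collections import defaultdict

def average_ratings(records):
    """Average rating per movie from (user, movie, rating) records."""
    totals = defaultdict(lambda: [0, 0])  # movie -> [rating sum, rating count]
    for _user, movie, rating in records:
        totals[movie][0] += rating
        totals[movie][1] += 1
    return {movie: total / count for movie, (total, count) in totals.items()}

def fans(records, movie, min_rating=4):
    """Users who rated the movie at or above min_rating (assumed definition)."""
    return sorted(u for u, m, r in records if m == movie and r >= min_rating)

# Hypothetical records
records = [
    ("u1", "m1", 5),
    ("u2", "m1", 3),
    ("u3", "m1", 4),
    ("u1", "m2", 2),
]
averages = average_ratings(records)
m1_fans = fans(records, "m1")
```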
Assignment – 4 and 5
Module 4 – Deep Dive in Pig
1. Introduction to Pig
• What Is Pig?
• Pig's Features
• Pig Use Cases
• Interacting with Pig
2. Basic Data Analysis with Pig
• Pig Latin Syntax
• Loading Data
• Simple Data Types
• Field Definitions
• Data Output
• Viewing the Schema
• Filtering and Sorting Data
• Commonly-Used Functions
• Hands-On Exercise: Using Pig for ETL Processing
3. Processing Complex Data with Pig
• Complex/Nested Data Types
• Grouping
• Iterating Grouped Data
• Hands-On Exercise: Analyzing Data with Pig
4. Multi-Dataset Operations with Pig
• Techniques for Combining Data Sets
• Joining Data Sets in Pig
• Set Operations
• Splitting Data Sets
• Hands-On Exercise
5. Extending Pig
• Macros and Imports
• UDFs
• Using Other Languages to Process Data with Pig
• Hands-On Exercise: Extending Pig with Streaming and UDFs
6. Pig Jobs
Case studies of Fortune 500 companies, Electronic Arts and Walmart, with real data sets.
Assignment – 6
Module 5 – Deep Dive in Hive
1. Introduction to Hive
• What Is Hive?
• Hive Schema and Data Storage
• Comparing Hive to Traditional Databases
• Hive vs. Pig
• Hive Use Cases
• Interacting with Hive
2. Relational Data Analysis with Hive
• Hive Databases and Tables
• Basic HiveQL Syntax
• Data Types
• Joining Data Sets
• Common Built-in Functions
• Hands-On Exercise: Running Hive Queries on the Shell, Scripts, and Hue
3. Hive Data Management
• Hive Data Formats
• Creating Databases and Hive-Managed Tables
• Loading Data into Hive
• Altering Databases and Tables
• Self-Managed Tables
• Simplifying Queries with Views
• Storing Query Results
• Controlling Access to Data
• Hands-On Exercise: Data Management with Hive
4. Hive Optimization
• Understanding Query Performance
• Partitioning
• Bucketing
• Indexing Data
5. Extending Hive
• User-Defined Functions
6. Hands-on Exercises – Playing with huge data and Querying extensively.
7. User-Defined Functions, Optimizing Queries, Tips and Tricks for performance tuning
Assignment – 7
Module 6 – Impala
1. Introduction to Impala
• What is Impala?
• How Impala Differs from Hive and Pig
• How Impala Differs from Relational Databases
• Limitations and Future Directions
• Using the Impala Shell
2. Choosing the Best (Hive, Pig, Impala)
3. Modeling and Managing Data with Impala and Hive
• Data Storage Overview
• Creating Databases and Tables
• Loading Data into Tables
• HCatalog
• Impala Metadata Caching
4. Data Partitioning
• Partitioning Overview
• Partitioning in Impala and Hive
Module 7 – (AVRO)
• Selecting a File Format
• Hadoop Tool Support for File Formats
• Avro Schemas
• Using Avro with Hive and Sqoop
• Avro Schema Evolution
• Compression
Module 8 – Introduction to HBase architecture
• What is HBase
• Where does it fit
• What is NoSQL
Assignment – 8
Apache Spark
Module 9 – Why Spark? Explain Spark and Hadoop Distributed File System
• What is Spark
• Comparison with Hadoop
• Components of Spark
Module 10 – Spark Components, Common Spark Algorithms – Iterative Algorithms, Graph Analysis, Machine Learning
• Apache Spark – Introduction, Consistency, Availability, Partition
• Unified Stack Spark
• Spark Components
• Comparison with Hadoop – Scalding example, Mahout, Storm, graph
Module 11 – Running Spark on a Cluster, Writing Spark Applications using Python, Java, Scala
• Explain Python example
• Show installing Spark
• Explain driver program
• Explaining Spark context with example
• Define weakly typed variable
• Combine Scala and Java seamlessly.
• Explain concurrency and distribution.
• Explain what a trait is.
• Explain higher-order function with example.
• Define OFI scheduler.
• Advantages of Spark
• Example of Lambda using Spark
• Explain MapReduce with example
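Higher-order functions and lambdas, two of the topics listed above, can be shown without a Spark cluster. The sketch below uses only the Python standard library and is not Spark code; `apply_twice` is an invented helper name.

```python
from functools import reduce

def apply_twice(func, value):
    """A higher-order function: it takes another function as an argument."""
    return func(func(value))

# map step: square each number with a lambda
squares = list(map(lambda x: x * x, [1, 2, 3, 4]))

# reduce step: fold the squares into a single total
total = reduce(lambda acc, x: acc + x, squares, 0)

plus_three_twice = apply_twice(lambda x: x + 3, 1)
```

Spark's transformations follow the same pattern of passing small functions to operations that run over a whole data set.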
Module 12 – Hadoop Cluster Setup and Running MapReduce Jobs
• Hadoop Multi-Node Cluster Setup using Amazon EC2 – Creating a 4-node cluster setup
• Running MapReduce Jobs on Cluster
Module 13 – Major Project – Putting it all together and Connecting Dots
• Putting it all together and Connecting Dots
• Working with Large data sets, Steps involved in analyzing large data
Assignment – 9, 10
Module 14 – Advance MapReduce
• Delving Deeper Into The Hadoop API
• More Advanced MapReduce Programming, Joining Data Sets in MapReduce
• Graph Manipulation in Hadoop
Assignment – 11, 12
Module 15 – ETL Connectivity with Hadoop Ecosystem
• How ETL tools work in Big Data Industry
• Connecting to HDFS from ETL tool and moving data from Local system to HDFS
• Moving Data from DBMS to HDFS
• Working with Hive with ETL Tool
• Creating MapReduce job in ETL tool
• End-to-End ETL PoC showing Hadoop integration with ETL tool.
Module 16 – Hadoop Cluster Configuration
• Hadoop configuration overview and important configuration file
• Configuration parameters and values
• HDFS parameters
• MapReduce parameters
• Hadoop environment setup
• 'Include' and 'Exclude' configuration files
Lab: MapReduce Performance Tuning
Module 17 – Hadoop Administration and Maintenance
• Namenode/Datanode directory structures and files
• File system image and Edit log
• The Checkpoint Procedure
• Namenode failure and recovery procedure
• Safe Mode
• Metadata and Data backup
• Potential problems and solutions / what to look for
• Adding and removing nodes
Lab: MapReduce File system Recovery
Module 18 – Hadoop Monitoring and Troubleshooting
• Best practices of monitoring a Hadoop cluster
• Using logs and stack traces for monitoring and troubleshooting
• Using open-source tools to monitor Hadoop cluster
Module 19 – Job Scheduling
• How to schedule Hadoop Jobs on the same cluster
• Default Hadoop FIFO Scheduler
• Fair Scheduler and its configuration
Module 20 – Hadoop Multi-Node Cluster Setup and Running MapReduce Jobs on Amazon EC2
• Hadoop Multi-Node Cluster Setup using Amazon EC2 – Creating a 4-node cluster setup
• Running MapReduce Jobs on Cluster
Module 21 – ZooKeeper
• ZooKeeper Introduction
• ZooKeeper use cases
• ZooKeeper Services
• ZooKeeper data Model
• Znodes and its types
• Znodes operations
• Znodes watches
• Znodes reads and writes
• Consistency Guarantees
• Cluster management
• Leader Election
• Distributed Exclusive Lock
• Important points
Module 22 – Advance Oozie
• Why Oozie?
• Installing Oozie
• Running an example
• Oozie – workflow engine
• Example MR action
• Word count example
• Workflow application
• Workflow submission
• Workflow state transitions
• Oozie job processing
• Oozie – Hadoop security
• Why Oozie security?
• Job submission to Hadoop
• Multi-tenancy and scalability
• Timeline of Oozie job
• Coordinator
• Bundle
• Layers of abstraction
• Architecture
• Use Case 1: time triggers
• Use Case 2: data and time triggers
• Use Case 3: rolling window
Module 23 – Advance Flume
• Apache Flume
• Big data ecosystem
• Physically distributed Data sources
• Changing structure of Data
• Closer look
• Anatomy of Flume
• Core concepts
• Event
• Clients
• Agents
• Source
• Channels
• Sinks
• Interceptors
• Channel selector
• Sink processor
• Data ingest
• Agent pipeline
• Transactional data exchange
• Routing and replicating
• Why channels?
• Use case – Log aggregation
• Adding Flume agent
• Handling a server farm
• Data volume per agent
• Example describing a single-node Flume deployment
Module 24 – Advance HUE
• HUE introduction
• HUE ecosystem
• What is HUE?
• HUE real-world view
• Advantages of HUE
• How to upload data in File Browser?
• View the content
• Integrating users
• Integrating HDFS
• Fundamentals of HUE FRONTEND
Module 25 – Advance Impala
• IMPALA Overview: Goals
• User view of Impala: Overview
• User view of Impala: SQL
• User view of Impala: Apache HBase
• Impala architecture
• Impala state store
• Impala catalogue service
• Query execution phases
• Comparing Impala to Hive
Testing
Module 26 – Hadoop Stack Integration Testing
• Why Hadoop testing is important
• Unit testing
• Integration testing
• Performance testing
• Diagnostics
• Nightly QA test
• Benchmark and end-to-end tests
• Functional testing
• Release certification testing
• Security testing
• Scalability Testing
• Commissioning and Decommissioning of Data Nodes Testing
• Reliability testing
• Release testing
Module 27 – Roles and Responsibilities of Hadoop Testing
• Understanding the Requirement, preparation of the Testing Estimation, Test Cases, Test Data, Test bed creation, Test Execution, Defect Reporting, Defect Retest, Daily Status report delivery, Test completion.
• ETL testing at every stage (HDFS, Hive, HBase) while loading the input (logs/files/records etc.) using Sqoop/Flume, which includes but is not limited to data verification and Reconciliation.
• User Authorization and Authentication testing (Groups, Users, Privileges etc.)
• Report defects to the development team or manager and drive them to closure.
• Consolidate all the defects and create defect reports.
• Validating new features and issues in Core Hadoop.
Module 28 – Framework called MRUnit for Testing of MapReduce Programs
• Report defects to the development team or manager and drive them to closure.
• Consolidate all the defects and create defect reports.
• Validating new features and issues in Core Hadoop
• Responsible for creating a testing Framework called MRUnit for testing of MapReduce programs.
Module 29 – Unit Testing
• Automation testing using Oozie.
• Data validation using the QuerySurge tool.
Module 30 – Test Execution of Hadoop_customized
• Test plan for HDFS upgrade
• Test automation and result
Module 31 – Test Plan Strategy, Test Cases of Hadoop Testing
• How to test install and configure
Module 32 – High Availability, Federation, YARN and Security
Module 33 – Job and Certification Support
• Major Project on Big Data and Hadoop, Hadoop Development, Cloudera Certification Tips and Guidance and Mock Interview Preparation, Practical Development Tips and Techniques, certification preparation
R Programming
Module 1 – How R Works
• Data mining Using Statistical packages
• A Few concepts Before Starting
Module 2 part 1 – What is R-Packages
• R-Calculator
• Assigning Values To Variables
• Vector Creation
Module 2 part 2 – What is Sorting
• Generating Repeats
• What is rep Function
• Generating Factor Levels
• Sorting Process
Module 2 part 3 – Transpose Function
• Stack Function Used
Module 3 part 1 – Functions & Reading Data from External Files
• Merge Function
• Strsplit Function
• Matrices
• Matrix Manipulation
• Row Sums
Module 3 part 2 – Generating Plots and Pie Charts
• Line Plots
• Bar Plots
• Bar Plots For Population
• Histogram
• Pie Chart Components
Module 4 part 1 – Analysis of Variance (ANOVA)
• One-Way Analysis of Variance
• Two-Way Analysis of Variance
Module 4 part 2 – What is Cluster Analysis
• K-Means Clustering
• Cluster Algorithm Working
Module 5 part 1 – Association Rule Mining / Affinity Analysis
• Association Rule Mining / Affinity Analysis
Module 5 part 2 – Two-Variable Relationships
• Linear Regression
• Dependent And Independent Variables
• Scatter Plots
Module 6 Part 1 – Database connectivity & Logistic Regression
• Logistic Regression
• Examples of Logistic Regression
• Logistic Regression in R
• Prediction
Module 6 Part 2 – ROC Curve in R
• Confusion Matrix
• ROC Curve in R
• Sensitivity & Specificity
• Database Connectivity RODBC
• Reading Data to ODBC Tables
• Function (Mean)
• Examples Of Function
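The confusion matrix, sensitivity and specificity topics in this module can be illustrated with a few lines of code. The course covers them in R; the following is a Python sketch on invented binary labels, where 1 is the positive class.

```python
def confusion_counts(actual, predicted):
    """Return (tp, tn, fp, fn) for binary labels where 1 means positive."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, tn, fp, fn

def sensitivity(tp, fn):
    """True positive rate: share of actual positives that were detected."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: share of actual negatives that were detected."""
    return tn / (tn + fp)

# Hypothetical labels
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
tp, tn, fp, fn = confusion_counts(actual, predicted)
```

An ROC curve plots sensitivity against 1 − specificity as the classification threshold varies, so each threshold contributes one such pair of numbers.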
Module 7 – Integrating R with Hadoop
• Methods to integrate two popular open-source software packages for Big Data analytics: R and Hadoop
• Integrating R with Hadoop using RHadoop and RMR package
• Exploring RHIPE (R Hadoop Integrated Programming Environment)
• Writing MapReduce Jobs in R and executing them on Hadoop
Mahout
Module 1 – Mahout Overview
• Classification and Recommendation
• Clustering in Mahout
• Pattern Mining
• Understanding machine Learning
• Using Model diagram to decide the approach
• Data flow
• Supervised and Unsupervised learning
Module 2 – Mahout Recommendations
• Concept of Recommendation
• Recommendations by E-commerce site
• Comparison between User Recommendations and Item recommendation
• Define recommenders and Classifiers
• Process of Collaborative Filtering
• Explaining Pearson coefficient algorithm
• Euclidean distance measure
• Implementing a recommender using MapReduce
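The Pearson coefficient and the Euclidean distance measure named in this module are both short formulas. The sketch below implements them in plain Python rather than with Mahout, on invented rating vectors.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

def euclidean(xs, ys):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)))

# Hypothetical ratings by two users for the same three items
same_taste = pearson([1, 2, 3], [2, 4, 6])
opposite_taste = pearson([1, 2, 3], [6, 4, 2])
distance = euclidean([0, 0], [3, 4])
```

Pearson ignores differences in scale (a user who rates everything one point higher still correlates perfectly), whereas Euclidean distance does not, which is one reason recommenders offer both.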
Module 3 – Clustering Session 1
• Defining Clustering
• User-to-user similarity
• Clustering Illustration
• Euclidean distance measure
• Distance measure vector
• Understanding the process of Clustering
• Vectorizing documents – Unstructured data
Module 3 – Clustering Session 2
• Document clustering
• Sequence-to-sparse Utility
• K-Means Clustering
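K-Means clustering alternates two steps: assign each point to its nearest centroid, then move each centroid to the mean of its points. The following is a small sketch of that loop in plain Python (3.8+ for `math.dist`), not Mahout's implementation; the points and the starting centroids are invented, and real implementations usually pick starting centroids randomly.

```python
import math

def assign(points, centroids):
    """Assignment step: put each point in the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for point in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: math.dist(point, centroids[i]))
        clusters[nearest].append(point)
    return clusters

def update(clusters, old_centroids):
    """Update step: move each centroid to the mean of its cluster."""
    new_centroids = []
    for cluster, old in zip(clusters, old_centroids):
        if cluster:
            new_centroids.append(tuple(sum(axis) / len(cluster) for axis in zip(*cluster)))
        else:
            new_centroids.append(old)  # keep an empty cluster's centroid in place
    return new_centroids

def k_means(points, centroids, max_iter=100):
    """Repeat both steps until the centroids stop moving (max_iter >= 1)."""
    for _ in range(max_iter):
        clusters = assign(points, centroids)
        new_centroids = update(clusters, centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

points = [(1, 1), (1, 2), (8, 8), (9, 8)]
centroids, clusters = k_means(points, [(0, 0), (10, 10)])
```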
Module 3 – Clustering Session 3
Module 4 – Classification Session 1
• Terminology
• Predictor and Target variable
• Classifiable Data
• Key Challenges in Classification algorithm
• Vectorizing Continuous data
• Classification Examples
• Logistic Regression and its examples
Module 4 – Clustering and Classification Session 2
• Clustering
• Clustering Process
• Transaction Clustering
• Different techniques of Vectorization
• Distance measure
• Clustering algorithm – K-Means
• Clustering Application-1
• Clustering Application-2
• Sentiment Analyzer
Module 5 – Pattern Mining
• Pearson Coefficient
• Collaborative Filtering Process
• Collaborative Filtering
• Similarity Algorithms
• Pearson Correlation
• Euclidean Distance Measure – Frequent Pattern & Association rules
• Frequent Pattern Growth
Session – Course Summary
Data Science
Module 1 – Getting started with Data Science and Recommender Systems
• Data Science Overview
• Reasons to use Data Science
• Project Lifecycle
• Data Acquirement
• Evaluation of Input Data
• Transforming Data
• Statistical and analytical methods to work with data
• Machine Learning basics
• Introduction to Recommender systems
• Apache Mahout Overview
Module 2 – Reasons to Use, Project Lifecycle
• What is Data Science?
• What Kind of Problems can you solve?
• Data Science Project Life Cycle
• Data Science – Basic Principles
• Data Acquisition
• Data Collection
• Understanding Data – Attributes in a Data, Different types of Variables
• Build the Variable type Hierarchy
• Two-Dimensional Problem
• Correlation b/w the Variables – explain using Paint Tool
• Outliers, Outlier Treatment
• Boxplot, How to Draw a Boxplot
Module 3 – Acquiring Data
• Discussion on Boxplot – also Explain
• Example to understand variable Distributions
• What is Percentile? – Example using RStudio tool
• How do we identify outliers?
• How do we handle outliers?
• Outlier Treatment: Using Capping/Flooring General Method
• Distribution – What is Normal Distribution?
• Why Normal Distribution is so popular?
• Uniform Distribution
• Skewed Distribution
• Transformation
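Capping and flooring, the outlier treatment listed in this module, replaces values beyond chosen percentiles with the percentile values themselves. The sketch below is one possible version in plain Python: the 5th/95th defaults and the linear-interpolation percentile are assumptions for illustration, since the course text does not fix either choice.

```python
def percentile(values, q):
    """Percentile with linear interpolation; values non-empty, q in [0, 100]."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction

def cap_and_floor(values, low_q=5, high_q=95):
    """Clamp every value into the [low_q, high_q] percentile range."""
    low = percentile(values, low_q)
    high = percentile(values, high_q)
    return [min(max(v, low), high) for v in values]

# Hypothetical data with one extreme value
values = [1, 2, 3, 4, 100]
median = percentile(values, 50)
treated = cap_and_floor(values, low_q=25, high_q=75)
```

Unlike dropping outliers, this keeps the row count unchanged, which matters when other columns of the same record are still useful.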
Module 4 – Machine Learning in Data Science
• Discussion about Boxplot and Outlier
• Goal: Increase Profits of a Store
• Areas of increasing the efficiency
• Data Request
• Business Problem: To maximize shop Profits
• What are Interlinked variables
• What is Strategy
• Interaction b/w the Variables
• Univariate analysis
• Multivariate analysis
• Bivariate analysis
• Relation b/w Variables
• Standardize Variables
• What is Hypothesis?
• Interpret the Correlation
• Negative Correlation
• Machine Learning
Module 5 – Statistical and analytical methods dealing with data, Implementation of Recommenders using Apache Mahout and Transforming Data
• Correlation b/w Nominal Variables
• Contingency Table
• What is Expected Value?
• What is Mean?
• How Expected Value differs from Mean
• Experiment – Controlled Experiment, Uncontrolled Experiment
• Degree of Freedom
• Dependency b/w Nominal Variable & Continuous Variable
• Linear Regression
• Extrapolation and Interpolation
• Univariate Analysis for Linear Regression
• Building Model for Linear Regression
• Pattern of Data means?
• Data Processing Operation
• What is sampling?
• Sampling Distribution
• Stratified Sampling Technique
• Disproportionate Sampling Technique
• Balanced Allocation – part of Disproportionate Sampling
• Systematic Sampling
• Cluster Sampling
• 2 angles of Data Science – Statistical Learning, Machine Learning
Module 6 – Testing and Assessment, Production Deployment and More
•
Multi variable analysis
•
Linear regression
•
Simple linear regression
•
Hypothesis testing
•
Speculation vs. claim (Query)
•
Sample
•
Steps to test your hypothesis
•
Performance measure
•
Generate null hypothesis
•
Alternative hypothesis
•
Testing the hypothesis
•
Threshold value
•
Hypothesis testing explanation by example
•
Null Hypothesis
•
Alternative Hypothesis
•
Probability
•
Histogram of mean value
•
Revisit CHI-SQUARE independence test
•
Correlation between Nominal Variables
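The chi-square independence test between two nominal variables can be sketched as follows. The contingency table is made-up data and the function name is hypothetical; the expected count for each cell is row total × column total / grand total, and the statistic is compared against a threshold value for the computed degrees of freedom.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the null hypothesis of independence.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical 2x2 table: rows = customer segment, columns = bought / did not buy.
observed = [[30, 10],
            [20, 40]]
stat, dof = chi_square_statistic(observed)

# 3.841 is the standard critical value for 1 degree of freedom at the 5% level.
reject_null = stat > 3.841
print(stat, dof, reject_null)
```

A statistic above the threshold means the null hypothesis of independence is rejected at that significance level.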
Module 7 – Business Algorithms, Simple approaches to Prediction, Building model, Model deployment
•
Machine Learning
•
Importance of Algorithms
•
Supervised and Unsupervised Learning
•
Various Algorithms on Business
•
Simple approaches to Prediction
•
Predict Algorithms
•
Population data
•
Sampling
•
Disproportionate Sampling
•
Steps in Model Building
•
Sample the data
•
What is K?
•
Training Data
•
Test Data
•
Validation data
•
Model Building
•
Find the accuracy
•
Rules
•
Iteration
•
Deploy the model
•
Linear regression
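The model-building steps listed in this module (sample the data, split into training and test data, build the model, find the accuracy) can be sketched with a simple linear regression. This is a minimal illustration under stated assumptions: the 80/20 split, the synthetic data, the mean-absolute-error measure and all function names are hypothetical, and a separate validation set is left out for brevity.

```python
import random


def split_rows(rows, train_fraction=0.8, seed=42):
    """Shuffle the rows and split them into training and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_absolute_error(points, slope, intercept):
    """Average absolute gap between actual and predicted y."""
    return sum(abs(y - (slope * x + intercept)) for x, y in points) / len(points)


# Synthetic data that follows y = 2x + 1 exactly (an assumption for the demo).
rows = [(x, 2.0 * x + 1.0) for x in range(20)]
train_rows, test_rows = split_rows(rows)
slope, intercept = fit_line(train_rows)
error = mean_absolute_error(test_rows, slope, intercept)
print(slope, intercept, error)
```

Measuring the error on rows the model never saw is what makes the accuracy figure meaningful; iterating on the model and re-checking this number is the loop the module outline describes.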
Module 8 – Getting started with Segmentation of Prediction and Analysis
•
Clustering
•
Cluster and Clustering with Example
•
Data Points, Grouping Data Points
•
Manual Profiling
•
Horizontal & Vertical Slicing
•
Clustering Algorithm
•
Criteria to take into Consideration before doing Clustering
•
Graphical Example
•
Clustering & Classification: Exclusive Clustering, Overlapping Clustering, Hierarchical Clustering
•
Simple Approaches to Prediction
•
Different types of Distances: 1. Manhattan, 2. Euclidean, 3. Cosine Similarity
•
Clustering Algorithm in Mahout
•
Probabilistic Clustering
•
Pattern Learning
•
Nearest Neighbor Prediction
•
Nearest Neighbor Analysis
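The three distance measures named in this module, and a nearest neighbor lookup built on them, can be written in a few lines. This is a plain-Python sketch with hypothetical function names and sample points; it is not the Mahout implementation.

```python
import math


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Straight-line distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def nearest_neighbor(query, points, distance=euclidean):
    """Return the point with the smallest distance to the query."""
    return min(points, key=lambda p: distance(query, p))


print(manhattan((0, 0), (3, 4)))
print(euclidean((0, 0), (3, 4)))
print(cosine_similarity((1, 2), (2, 4)))
print(nearest_neighbor((1, 1), [(0, 0), (5, 5), (1, 2)]))
```

Note that cosine similarity is a similarity, not a distance: larger values mean closer, so it needs to be negated or subtracted from 1 before being used with `nearest_neighbor`.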
Module 9 – Integration of R and Hadoop
•
R introduction
•
How R is typically used
•
Features of R
•
Introduction to Big data
•
R+Hadoop
•
Ways to connect with R and Hadoop
•
Products
•
Case Study
•
Architecture
•
Steps for Installing RIMPALA
•
How to create IMPALA packages
Projects
Project 1 - Understanding Cold Start Problem in Data Science
•
Algorithms for Recommender
•
Ways of Recommendation
•
Types of Recommendation - Collaborative Filtering Based Recommendation, Content-Based Recommendation
•
Cold Start Problem
Project 2 - Recommendation for Movie, Summary
•
Recommendation for movie
•
Two Types of Predictions – Rating Prediction, Item Prediction
•
Important Approaches: Memory Based and Model Based
•
Knowing User Based Methods in K-Nearest Neighbor
•
Understanding Item Based Method
•
Matrix Factorization
•
Singular Value Decomposition
•
Data Science Project discussion
•
Collaborative Filtering
•
Business Variables Overview
Data Science Assignment
•
Real-time enterprise problem
•
Use of various datasets to solve this problem
•
Use of Variables for Problem Resolution
•
Building strategy to solve this problem with the available data
•
Descriptive Statistics
SPT (Statistics and Probability Training)
Module 1 – Information of Statistics
•
What is statistics
•
How is this useful
•
What is this course for
Module 2 – Data Conversion
•
Converting data into useful information
•
Collecting the data
•
Understand the data
•
Finding useful information in the data
•
Interpreting the data
•
Visualizing the data
Module 3 – Terms of Statistics
•
Descriptive statistics
•
Let us understand some terms in statistics
•
Variable
Module 4 – Plots
•
Dot Plots
•
Histogram
•
Stemplots
•
Box and whisker plots
•
Outlier detection from box plots and Box and whisker plots
Module 5 – Statistics & Probability
•
What is probability
•
Set & rules of probability
•
Bayes Theorem
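Bayes' theorem, listed in the probability module, states P(A|B) = P(B|A) · P(A) / P(B). A short worked example makes the arithmetic concrete; the scenario and its numbers (1% prior, 90% detection rate, 5% false positive rate) are invented for illustration.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(A | B) given P(A), P(B | A) and P(B | not A)."""
    # Total probability of observing B.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% of items are defective, the check flags 90% of
# defective items and wrongly flags 5% of good ones.
posterior = bayes_posterior(prior=0.01, likelihood=0.90, false_positive_rate=0.05)
print(round(posterior, 4))
```

Even with a fairly accurate check, the posterior stays low because the prior is small, which is the usual point of this kind of exercise.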
Module 6 – Distributions
•
Probability Distributions
•
Few Examples
•
Student T-Distribution
•
Sampling Distribution
•
Student t-Distribution
•
Poisson distribution
Module 7 – Sampling
•
Stratified Sampling
•
Proportionate Sampling
•
Systematic Sampling
•
P – Value
•
Stratified Sampling
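Proportionate stratified sampling draws the same fraction from every stratum, so the sample keeps the population's mix. The sketch below assumes a 10% sampling fraction and a hypothetical `region` attribute as the stratum; both are illustrative.

```python
import random


def proportionate_stratified_sample(rows, stratum_of, fraction, seed=0):
    """Sample the same fraction from every stratum (at least one row each)."""
    rng = random.Random(seed)
    strata = {}
    for row in rows:
        strata.setdefault(stratum_of(row), []).append(row)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 80 rows from "north", 20 rows from "south".
population = [("north", i) for i in range(80)] + [("south", i) for i in range(20)]
sample = proportionate_stratified_sample(population, lambda row: row[0], 0.10)

north = sum(1 for region, _ in sample if region == "north")
south = sum(1 for region, _ in sample if region == "south")
print(north, south)
```

Disproportionate sampling would instead pass a different fraction per stratum, for example to over-represent a small but important group.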
Module 8 – Tables & Analysis
•
Cross Tables
•
Bivariate Analysis
•
Multivariate Analysis
•
Dependence and Independence tests (Chi-Square)
•
Analysis of Variance
•
Correlation between Nominal variables
Project – Data Analysis Project
Data – Sales
Problem Statement – It includes the following actions:
•
Understand the business solutions
•
Discussion with the warehouse team
•
Data Collection & Storage
•
Data Cleaning
•
Build a Hypothesis Tree around the business problem
•
Produce the final result.
Apache Solr
Module 1. The Fundamentals
•
About Solr
•
Installing and running Solr
•
Adding content to Solr
•
Reading a Solr XML response
•
Changing parameters in the URL
•
Using the browse interface
Module 2. Searching
•
Sorting results
•
Query parsers
•
More queries
•
Hardwiring request parameters
•
Adding fields to default search
•
Faceting
•
Result grouping
Module 3. Indexing
•
Adding your own content to Solr
•
Deleting data from Solr
•
Building a bookstore search
•
Adding book data
•
Exploring the book data
•
Dedupe update processor
Module 4. Updating your schema
•
Adding fields to the schema
•
Analyzing text
Module 5. Relevance
•
Field weighting
•
Phrase queries
•
Function queries
•
Fuzzier search
•
Sounds-like
Module 6. Extended features
•
More-like-this
•
Geospatial
•
Spell checking
•
Suggestions
•
Highlighting
•
Pseudo-fields
•
Pseudo-joins
•
Multilanguage
Module 7. Multicore
•
Adding more kinds of data
Module 8. SolrCloud
•
Introduction
•
How SolrCloud works
•
Commit strategies
•
ZooKeeper
•
Managing Solr config files
Project – Function Queries
Problem Statement – It describes how to use function queries in Solr. Suppose an index stores the dimensions x, y, z (in meters) of some hypothetical boxes, with arbitrary names stored in the field boxname. Suppose we want to search for boxes matching the name findbox, ranked according to the volumes of the boxes.
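One possible way to express this project's query is to sort by the `product(x,y,z)` function, which multiplies the three dimension fields. The sketch below only builds the request URL; the host, port and the core name `boxes` are assumptions, as is the choice of descending order, and the `volume:` alias in `fl` relies on Solr's pseudo-field support.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_volume_query(base_url, name):
    """Build a Solr select URL that ranks matching boxes by volume."""
    params = {
        "q": f"boxname:{name}",
        # Return the computed volume as a pseudo-field next to the stored fields.
        "fl": "boxname,x,y,z,volume:product(x,y,z)",
        # Function query used as the sort key; largest volume first (assumed).
        "sort": "product(x,y,z) desc",
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint: a local Solr with a core named "boxes".
url = build_volume_query("http://localhost:8983/solr/boxes/select", "findbox")
decoded = parse_qs(urlsplit(url).query)
print(url)
print(decoded["sort"])
```

Sorting by the function leaves the relevance score untouched; folding the volume into the score itself would be a different design and is not shown here.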
Splunk
Module 1 – Basic Concepts of Splunk Development
•
Splunk development concepts
•
Roles and responsibilities of Splunk Developer
Module 2 – Saving and Scheduling Searches
•
Exporting search results
•
Saving and sharing search results
•
Saving searches
•
Search scheduling
Module 3 – Creating Alerts
•
Describing alerts
•
Alert Creation
•
View fired alerts
Module 4 – Tags and Event Types
•
Understanding tags
•
Creating tags and using them in a search
•
Defining event types and their usefulness
•
Creating and using event types in a search
Module 5 – Search Commands
•
Reviewing search commands and performing general search practices
•
Examine the anatomy of a search
•
Using various commands to perform searches: fields, table, rename, rex & erex, multiply
Module 6 – Reporting Commands
•
Using following commands and their functions: 1. top 2. rare 3. stats 4. addcoltotals 5. addtotals
Module 7 – Visualizations
•
Explore the available visualizations
•
Create Charts and timecharts
•
Omit null values and format results
Module 8 – Analyzing, Calculating and Formatting Results
•
Using eval command
•
Perform calculations
•
Value Conversion
•
Round values
•
Format values
•
Conditional statements
•
Filtering calculated results
Module 9 – Correlating Events
•
Overview of Transactions
•
Search Transactions
Module 10 – Enriching Data with Lookups
•
What are lookups?
•
Lookup file example
•
Creating a lookup table
•
Defining a lookup
•
Configuring an automatic lookup
•
Using the lookup in searches and reports
Module 11 – Creating Reports and Dashboards
•
Creating reports and charts
•
Creating dashboards and adding reports
Module 12 – Getting started with Parsing
•
Data Preview and Parsing Phase
•
Raw Data Manipulation
•
Extraction of Fields
Project
•
The Splunk Project, taken after finishing this training course, lets you create a report and dashboard from a text file containing employee details. You will perform various row operations to fetch data as per your requirements and use important Splunk commands on the file to extract certain fields. Other significant aspects of this project are editing the event, adding tags, searching events by tag name and saving the tag search.
Splunk Admin
Module 1 - Simple Splunk Environment
•
Installing Splunk
•
License Management
•
Data Inputs
•
App management
Module 2 - Basic Production Environment
•
Introduction to Splunk Configuration Files
•
Universal Forwarder
•
Forwarder Management
Module 3 – Various Data Inputs
•
Understanding Monitor Inputs
•
What are Network Inputs?
•
Define Modular and Scripted Inputs
•
Explaining Windows Inputs
•
What are Fine-tuning Inputs?
Module 4 – Index and User Management
•
Concept of Indexing in Splunk
•
Maintenance and Optimization of Indexes
•
Users: Their Roles and Authentication
Module 5 – Getting started with Parsing
•
Data Preview and Parsing Phase
•
Raw Data Manipulation
•
Extraction of Fields
Module 6 – Search Scaling and Monitoring
•
Performing Distributed Search
•
Search Performance Tuning
•
Understanding Execution issues in large scale deployment
•
Distributed Management Console
Project – Field Extraction
Problem Statement – It includes:
•
About Field Extraction
•
Field Extractor Utility
•
Field Extraction page in Splunk Web
•
Configure field extraction in configuration files etc.
Apache Storm
Module 1 – Understanding Architecture of Storm
•
Bayesian Law
•
Hadoop Distributed Computing
•
Big Data features
•
Legacy Architecture of Real Time System
•
Storm vs. Hadoop
•
Logical Dynamic and Components in Storm
•
Storm Topology
•
Execution Components in Storm
•
Stream Grouping
•
Tuple
•
Spout
•
Bolt - normalization bolt
Module 2 – Installation of Apache Storm
•
Installing Apache Storm
Module 3 – Grouping
•
Different types of Grouping
•
Reliable and unreliable messaging
•
Fetching data – Direct connection and En-queued message
•
Bolt Lifecycle
Module 4 – Overview of Trident
•
Trident Spouts and its types
•
Components and Interface of Trident spout
•
Trident Function, Filter & Aggregator
Module 5 – Boot Stripping
•
Twitter Boot Stripping
•
Detailed learning on Boot Stripping
•
Concepts of Storm
•
Storm Development Environment
Projects
Real-time Project on Storm
The Project Bolt Blue Print
HBase
Module 1 – HBase Overview
•
Getting started with HBase
•
Core Concepts of HBase
•
Understanding HBase with an Example
Module 2 – Architecture of NoSQL
•
Why HBase?
•
Where to use HBase?
•
What is NoSQL?
Module 3 – HBase Data Modeling
•
HDFS vs. HBase
•
HBase Use Cases
•
Data Modeling HBase
Module 4 – HBase Cluster Components
•
HBase Architecture
•
Main components of HBase Cluster
Module 5 – HBase API and Advanced Operations
•
HBase Shell
•
HBase API
•
Primary Operations
•
Advanced Operations
Module 6 – Integration of Hive with HBase
•
Create a Table and Insert Data into it
•
Integration of Hive with HBase
•
Load Utility
Module 7 – File loading with both load Utility
•
Putting Folder to VM
•
File loading with both load Utility
Project – Integrate Hive and Java with HBase
Problem Statement – This project describes how to integrate Hive and Java with HBase. It includes the following actions:
•
Installation of HBase
•
Creation of Table
•
Java Program to create the table in HBase
•
Managing the HBase Table with Hive
•
Bulk Import etc.
Cassandra
Module 1 - Advantages and Usage of Cassandra
•
Brief Introduction of the course
•
Advantages and Usage of Cassandra
Module 2 - CAP Theorem and NoSQL Database
•
Why NoSQL Database
•
Replication in RDBMS
•
Key Challenges with RDBMS
•
Schema
•
NoSQL (Not only SQL)
•
NoSQL Category
•
Advantage & Limitation
•
Key Characteristics of NoSQL Database
•
CAP Theorem
•
Consistency
Module 3 - Cassandra fundamentals, Data model, Installation and setup
•
What is Cassandra?
•
Non relational
•
Key deployment concept
•
What is column oriented database
•
Data Model – column
•
What is column family
•
Installation
Module 4 - Steps in Configuration
•
Token calculation
•
Configuration overview
•
Node tool
•
Validators
•
Comparators
•
Expiring column
•
QA
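The "Token calculation" topic can be illustrated with the classic manual calculation of evenly spaced initial tokens. The sketch assumes the RandomPartitioner, whose token ring spans 0 to 2^127; that choice of partitioner is an assumption based on the era of the other topics in this course (Hector, super column families), and the function name is hypothetical.

```python
def initial_tokens(node_count, ring_size=2 ** 127):
    """Evenly spaced tokens for a ring of `node_count` nodes.

    ring_size defaults to 2**127, the RandomPartitioner range (assumed here).
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * ring_size // node_count for i in range(node_count)]


# Example: a four-node cluster.
tokens = initial_tokens(4)
for node, token in enumerate(tokens):
    print(f"node {node}: initial_token = {token}")
```

Each node owns the range ending at its token, so evenly spaced tokens give every node roughly the same share of the data.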
Module 5 - Summarization, node tool commands, cluster, Indexes, Cassandra & MapReduce, Installing OpsCenter
•
Difference between Relational modeling & Cassandra modeling
•
Steps in Cassandra modeling
•
Time series modeling in Cassandra
•
Column family
•
Data modeling in Cassandra
•
Column family vs. Super column family
•
Counter column family
•
Partitioners
•
Partitioners strategies
•
Replication
•
Gossip protocols
•
Read operation
•
Consistency
•
Comparison
Module 6 - Multi Cluster setup
•
Node settings
•
Setup of Multinode cluster
•
Row cache and Key cache
•
Read operation
•
System keyspace
•
Commands overview
•
Column family
•
Nodes
Module 7 - Thrift/AVRO/JSON/Hector Client
•
Thrift
•
AVRO
•
JSON
•
Hector client
•
How to write Java code
•
Hector tag
Module 8 - DataStax installation part, Secondary index
•
Node tool commands
•
Management of Cassandra
•
Secondary index
•
Cassandra & MapReduce
•
DataStax installation part
Module 9 - Cassandra API and Summarization and Thrift
•
API
•
Internals of connection pool
•
Client connectivity to Cassandra
•
Hector client key features
•
Hector client key concepts
•
Java code
•
Summarization
•
Thrift
MongoDB
Module 1 – Getting started with NoSQL, MongoDB and their Installation
•
Database type description
•
What is NoSQL Database?
•
NoSQL Database Types
•
Challenges with RDBMS
•
Why do we require NoSQL data?
•
What is MongoDB
•
JSON/BSON Introduction
•
JSON Data Types
•
Example of JSON
•
Installation of MongoDB
Module 2 – Part 1 – NoSQL and its Importance
•
Database Type
•
OLTP
•
OLAP
•
NoSQL
•
Type of NoSQL Database
•
Challenges with RDBMS
•
Why NoSQL
•
ACID property
•
CAP Theorem
•
BASE property
•
Introduction to JSON/BSON
•
JSON Data types
•
Database collection & document
•
MongoDB use cases
•
Unacknowledged
•
Acknowledged
•
Journaled
•
Fsynced
•
Replica Acknowledged
Module 2 – Part 2 – CRUD Operations
•
MongoDB CRUD Tutorial
•
Installation Rent
•
Used ppt
•
JSON and its syntax
•
CRUD Introduction
•
Read and Write Operations
•
Write Operation Concern Levels
•
MongoDB CRUD Tutorials
•
MongoDB CRUD Reference
•
Hands on with CRUD Operations
Module 3 – Part 1 – Understanding Schema Design, Backup strategies, Data Modeling and Monitoring
•
Data Modeling in MongoDB
•
RDBMS vs. Data models
•
Data Modeling tools
•
Data modeling example & patterns
•
Model Tree structure
•
Operational strategies
•
Backup strategies
•
Monitoring
•
Monitoring Commands
•
Monitoring of performance issues
•
Run time configuration
•
Export & import of data
•
Relationship between Documents
•
Model Specific Application Contexts
•
Data Model Reference
•
Hands on with MongoDB Data Modeling
Module 3 – Part 2 – Data Administration and Management
•
Data Management
•
Introduction to replica
•
Election of new primary
•
Replica set
•
Type of Replica
•
Hidden Replica
•
Arbiter Replica
•
Sharding
•
Concepts around Replication
•
Setting up replicated cluster
•
Setting up Sharded Cluster
•
Sharding Database, Collections
•
Hands on Exercise
Module 4 – Indexes and Aggregation
•
Introduction to Indexes
•
Concepts around Indexes
•
Type of Indexes
•
Index Property
•
Introduction to Aggregation
•
Type of Aggregation
•
Use cases of Aggregation
•
Hands on Exercise
Module 5 – Security in MongoDB
•
Security Risks to Databases
•
MongoDB Security Approach
•
MongoDB Security Concept
•
Access Control
•
Integration of MongoDB with Robomongo
•
Integration of MongoDB with Java
Module 6 – MongoDB Integration with Jaspersoft, Load and Manage Unstructured Data (Videos, Images, Logs, Resumes etc.)
•
Integration of MongoDB with Jaspersoft
•
Additional Concept (GridFS, mongo files)
•
Loading and Managing Unstructured Data (Videos, Images, Logs, Resumes etc.)
Project – Java MongoDB Integration
Problem Statement – It creates a table to insert the video file using the Java program. For this it performs the following actions:
•
Installation of Java
•
Adding MongoDB Java Connector etc.
Apache Spark
Module 1 - Why Spark? Explain Spark and Hadoop Distributed File System
•
What is Spark
•
Comparison with Hadoop
•
Components of Spark
Module 2 - Spark Components, Common Spark Algorithms - Iterative Algorithms, Graph Analysis, Machine Learning
•
Apache Spark - Introduction, Consistency, Availability, Partition
•
Unified Stack Spark
•
Spark Components
•
Comparison with Hadoop – Scalding example, Mahout, Storm, graph
Module 3 - Running Spark on a Cluster, Writing Spark Applications using Python, Java, Scala
•
Explain Python example
•
Show installing Spark
•
Explain driver program
•
Explaining Spark context with example
•
Define weakly typed variable
•
Combine Scala and Java seamlessly.
•
Explain concurrency and distribution.
•
Explain what is trait.
•
Explain higher order function with example.
•
Define OFI scheduler.
•
Advantages of Spark
•
Example of Lambda using Spark
•
Explain MapReduce with example
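The MapReduce model mentioned in this module is usually introduced with a word count. The sketch below imitates the three phases (map, shuffle, reduce) in plain Python on a single machine; it does not use Hadoop or Spark, and the function names and input lines are illustrative.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in the line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}


lines = ["spark and hadoop", "spark streaming", "hadoop and storm"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = reduce_phase(shuffle(pairs))
print(counts)
```

In a real cluster the map and reduce functions run on different machines and the shuffle moves data across the network, which is typically the expensive step.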
Module 4 - RDD and its operation
•
Difference between RISC and CISC
•
Define Apache Mesos
•
Cartesian product between two RDD
•
Define count
•
Define Filter
•
Define Fold
•
Define API Operations
•
Define Factors
Module 5 - Spark, Hadoop, and the Enterprise Data Centre, Common Spark Algorithms
•
How Hadoop cluster is different from Spark
•
Define writing data
•
Explain sequence file and its usefulness
•
Define protocol buffers
•
Define text file, CSV, Object Files and File System
•
Define sparse metrics
•
Explain RDD and Compression
•
Explain data stores and its usefulness
Module 6 - Spark Streaming
•
Define Elastic Search
•
Explain Streaming and its usefulness
•
Apache BookKeeper
•
Define DStream
•
Define MapReduce word count
•
Explain Parquet
•
Scala ORM
•
Define MLlib
•
Explain multi graphix and its usefulness
•
Define property graph
Module 7 - Spark Persistence in Spark
•
Persistence
•
Motivation
•
Example
•
Transformation
•
Scala and Python
•
Examples – K-means
•
Latent Dirichlet Allocation (LDA)
Module 8 - Broadcast and accumulator
•
Motivation
•
Broadcast Variables
•
Example: Join
•
Alternative if one table is small
•
Better version with broadcast
•
How to create a Broadcast
•
Accumulators motivation
•
Example: Join
•
Accumulator Rules
•
Custom accumulators
•
Another common use
•
Creating an accumulator using Spark context object
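The idea behind a broadcast join ("alternative if one table is small") can be shown without Spark: the small table is copied to every worker as a lookup dictionary, so the large table never has to be shuffled. The sketch below is a single-process simulation under stated assumptions; the dictionary stands in for a broadcast variable, the counter stands in for an accumulator, and none of the names are Spark API calls.

```python
def broadcast_join(partitions, small_table):
    """Join each partition of a large data set against a small lookup table."""
    lookup = dict(small_table)   # stands in for the broadcast variable
    unmatched = 0                # stands in for an accumulator
    joined = []
    for partition in partitions:
        for key, value in partition:
            if key in lookup:
                joined.append((key, value, lookup[key]))
            else:
                unmatched += 1   # workers only add; the driver reads the total
    return joined, unmatched


# Hypothetical data: events keyed by user id, split over two partitions.
events = [[(1, "click"), (2, "view")], [(3, "click"), (1, "buy")]]
users = [(1, "alice"), (2, "bob")]
joined, unmatched = broadcast_join(events, users)
print(joined, unmatched)
```

The counter mirrors the usual accumulator rule that tasks may only add to it, while the final value is read once on the driver side.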
Module 9 - Spark SQL and RDD
•
Introduction
•
Spark SQL main capabilities
•
Spark SQL usage diagram
•
Spark SQL
•
Important topics in Spark SQL - Data frames
•
Twitter language analysis
Module 10 - Operations/Accumulators/Traits
•
How parallelism takes place
•
The Master Parameter
•
Join Operations Example
•
Accumulators
•
Traits
Module 11 - Scheduling/Partitioning
•
Task Scheduling distribution
•
Scheduling Around Applications
•
Static Partitioning
•
Dynamic Sharing
•
Scheduling Within An Application
•
Fair Scheduling
•
High Availability Of Spark Master
•
Standby Masters With ZooKeeper
•
Single Node Recovery With Local File System
•
High Order Functions
Module 12 - Capacity Planning in Spark
•
Practicals: Creating Maps, Transformations
•
Capacity planning in Spark
•
Concurrency in Java
•
Concurrency in Scala
Module 13 - Log Analysis
•
Array Buffers
•
Compact Buffer
•
Protocol Buffer
•
Log Analysis With Spark
•
First Log Analyzers In Spark
Mini Projects
Project 1. List the items
Project 2. Sorting of Records
Project 3. Show a histogram of date vs users created. Optionally, use a rich visualization like
Project 4. Prepare a map of tags vs # of questions in each tag and display it.
Major Projects
Project 1 Movie Recommendation
Project 2 Twitter API Integration for tweet Analysis
Project 3 Data Exploration Using Spark SQL – Wikipedia dataset
Scala
Module 1 - Introduction of Scala
•
Scala Overview
Module 2 - Pattern Matching
•
Advantages of Scala
•
REPL (Read Evaluate Print Loop)
•
Language Features
•
Type Inference
•
Higher order function
•
Option
•
Pattern Matching
•
Collection
•
Currying
•
Traits
•
Application Space
Module 3 - Executing the Scala code
•
Uses of Scala interpreter
•
Example of static object timer in Scala
•
Testing of String equality in Scala
•
Implicit classes in Scala with examples.
•
Recursion in Scala
•
Currying in Scala with examples.
•
Classes in Scala
Module 4 - Classes concept in Scala
•
Constructor
•
Constructor overloading
•
Properties
•
Abstract classes
•
Type hierarchy in Scala
•
Object equality
•
Val and var methods
Module 5 - Case classes and pattern matching
•
Sealed traits
•
Case classes
•
Constant pattern in case classes
•
Wild card pattern
•
Variable pattern
•
Constructor pattern
•
Tuple pattern
Module 6 - Concepts of traits with example
•
Java equivalents
•
Advantages of traits
•
Avoiding boilerplate code
•
Linearization of traits
•
Modelling a real world example
Module 7 - Scala Java Interoperability
•
How traits are implemented in Scala and Java
•
How extending multiple traits is handled
Module 8 - Scala collections
•
Classification of Scala collections
•
Iterable
•
Iterator and iterable
•
List sequence example in Scala
Module 9 - Mutable collections vs. Immutable collections
•
Array in Scala
•
List in Scala
•
Difference between list and list buffer
•
Array buffer
•
Queue in Scala
•
Dequeue in Scala
•
Mutable queue in Scala
•
Stacks in Scala
•
Sets and maps in Scala
•
Tuples
Module 10 - Use Case bobsrockets package
•
Different import types
•
Selective imports
•
Testing - Assertions
•
Scala test case - ScalaTest FunSuite
•
JUnit test in Scala