Advanced Analytics and Real-Time Data Processing in Apache Spark

4.7 (3 reviews)
£199 £49 (inc. VAT)
  • 365 Days
  • Advanced
  • Course Certificate
  • 04 Guided Learning Hours
  • Course Material
  • 07 Number of Modules
  • Exam Included


Apache Spark is a unified analytics engine for processing and analysing big data. It has gained recognition within large organisations for its speed, ease of use, standard interface and real-time data processing features, and knowing it can be a real advantage when getting into the data analysis or data science field of work. If you want to learn advanced analytics and real-time processing in Apache Spark, this course is the right place to start: it covers the aspects of Apache Spark you need to begin processing and analysing big data.

This professionally narrated course starts by diving into the architecture and components of Spark Streaming and how it can be used to generate final data batches. You will then move on to explore the use cases of Spark Streaming applications, so that you can use it appropriately with a suitable engine, along with an insight into the Spark Streaming word count problem and the Spark Streaming API. The course also gives due attention to the stressful task of handling events that arrive out of order while building streaming applications.
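
To give a flavour of the word count problem mentioned above, here is a minimal sketch in plain Python. It does not use the Spark Streaming API; it only illustrates the idea of counting words in small batches of incoming lines while keeping a running total. The function names and the sample batches are hypothetical and are not taken from the course material.

```python
from collections import Counter
from typing import Iterable, List


def count_words(batch: Iterable[str]) -> Counter:
    """Count the words in one micro-batch of text lines."""
    counts = Counter()
    for line in batch:
        counts.update(line.lower().split())
    return counts


def running_word_count(batches: Iterable[List[str]]) -> List[Counter]:
    """Return a snapshot of the cumulative counts after each micro-batch."""
    state = Counter()  # running totals carried across batches
    snapshots = []
    for batch in batches:
        state.update(count_words(batch))  # merge this batch into the totals
        snapshots.append(Counter(state))  # copy so later batches don't alter it
    return snapshots


# Hypothetical input: two micro-batches of lines arriving one after the other.
batches = [["spark streaming", "spark"], ["streaming word count"]]
snapshots = running_word_count(batches)
```

In Spark Streaming itself, each batch would be processed in parallel across a cluster and the running totals would be kept by a stateful operation; the sketch only shows the logic on a single machine.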

Our highly talented tutors will then guide you through creating a project using Spark's MLlib library to give you more hands-on experience with the framework. You will then move on to explore the components and operations of Spark GraphX and use it to create graphs for analysis, followed by a chapter on SparkR and its role in distributed data frame implementation. To top it all off, you will also be taught how to send real-time notifications when a user wants to buy a product from an e-commerce site. By the end of this course, you will have a firm grip on Apache Spark and be able to use its advanced analytics and real-time data processing features in your career.
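
As a taste of the graph analysis mentioned above, the following plain-Python sketch models a small property graph and computes two things the GraphX chapters cover by name: the degree of each vertex and a triplet view of the edges. It is an illustration of the concepts only and does not use the GraphX API; the vertex names, edge labels and helper functions are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical data: vertex id -> property, and directed edges with a label.
vertices: Dict[int, str] = {1: "alice", 2: "bob", 3: "carol"}
edges: List[Tuple[int, int, str]] = [
    (1, 2, "follows"),
    (1, 3, "follows"),
    (3, 2, "follows"),
]


def degrees(edge_list: List[Tuple[int, int, str]]) -> Tuple[Counter, Counter]:
    """Return (in_degree, out_degree) counters keyed by vertex id."""
    in_deg, out_deg = Counter(), Counter()
    for src, dst, _label in edge_list:
        out_deg[src] += 1
        in_deg[dst] += 1
    return in_deg, out_deg


def triplets(vertex_props: Dict[int, str],
             edge_list: List[Tuple[int, int, str]]) -> List[Tuple[str, str, str]]:
    """Join each edge with the properties of its source and destination."""
    return [(vertex_props[s], label, vertex_props[d]) for s, d, label in edge_list]


in_deg, out_deg = degrees(edges)
views = triplets(vertices, edges)
```

Here `views` holds one entry per edge, pairing the edge label with the properties of both endpoints, which is the idea behind a triplet view.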

Why study at Global Edulink?

Global Edulink offers the most convenient path to gain recognised skills and training that will give you the opportunity to put your knowledge and expertise into practice in an IT or corporate environment. You can study at your own pace at Global Edulink, and you will be provided with all the necessary material, tutorials, a qualified course instructor, narrated e-learning modules and free resources, which include a free CV writing pack, free career support and a course demo, to make your learning experience more enriching and rewarding.


Access Duration

The course will be delivered directly to you, and you have 12 months' access to the online learning platform from the date you joined the course. The course is self-paced and you can complete it in stages, revisiting the lectures at any time.

Who is this course aimed at?

This course might interest individuals looking to master advanced analytics and real-time data processing to get into or progress within the data analysis or data science field of work.

Entry Requirements

  • Learners must be aged 16 or over and should have a basic understanding of the English language, numeracy, literacy and ICT.
  • A basic knowledge of Spark programming, Apache Spark and real-time data processing is required to follow this course.

Method Of Assessment

The course is assessed online with a final, multiple-choice test, which is marked automatically. You will know instantly whether you have passed the course.

Certification

Those who pass this test will get a certificate in Advanced Analytics and Real-Time Data Processing with Apache Spark.

Career Path

This certificate will improve your candidature for a number of jobs in the data analysis or data science field. You can also use this certificate to prove your eligibility for the job incentives put forth by your organisation or to expand your knowledge of the area. Listed below are a few of the jobs this certificate will benefit you in, along with the average UK salary per annum according to,

  • Data analyst – £25,972 per annum
  • Data scientist – £35,226 per annum
  • Data manager – £29,986 per annum
  • Data analysis manager – £37,349 per annum
  • Data engineer – £41,223 per annum

Other benefits

  • High-quality e-learning study materials and mock exams.
  • Tutorials/materials from industry-leading experts.
  • 24/7 access to the Learning Portal.
  • The benefit of applying for the TOTUM extra Discount Card.
  • Recognised Accredited Qualification.
  • Excellent customer service and administrative support.

Key Features

Gain an Accredited UK Qualification

Access to Excellent Quality Study Materials

Learners will be Eligible for TOTUM Discount Card

Personalised Learning Experience

UK Register of Learning Providers Reg No : 10053842

Support by Phone, Live Chat, and Email

Course Curriculum

Getting Started
Online Training User Manual
E Certificate Download Guide
Section 1: Spark Streaming
1.1. The Course Overview
1.2. Introducing Spark Streaming
1.3. Streaming Context
1.4. Processing Streaming Data
1.5. Use Cases
1.6. Spark Streaming Word Count Hands-On
1.7. Spark Streaming – Understanding Master URL
1.8. Integrating Spark Streaming with Apache Kafka
1.9. mapWithState Operation
1.10. Transform and Window Operation
1.11. Join and Output Operations
1.12. Output Operations – Saving Results to Kafka Sink
Section 2: Advanced Streaming and Use Cases
2.1. Handling Time in High Velocity Streams
2.2. Connecting External Systems That Work with an At-Least-Once Guarantee – Deduplication
2.3. Building Streaming Application – Handling Events That Are Not in Order
2.4. Filtering Bots from Stream of Page View Events
Section 3: Spark MLlib and ML Pipelines
3.1. Introducing Machine Learning with Spark
3.2. Feature Extraction and Transformation
3.3. Transforming Text into Vector of Numbers – ML Bag-of-Words Technique
3.4. Logistic Regression
3.5. Model Evaluation
3.6. Clustering
3.7. Gaussian Mixture Model
3.8. Principal Component Analysis and Distributing the Singular Value Decomposition (SVD)
3.9. Collaborative Filtering – Building Recommendation Engine
Section 4: Spark GraphX
4.1. Introducing Spark GraphX – How to Represent a Graph?
4.2. Limitations of Graph-Parallel System – Why Spark GraphX?
4.3. Importing GraphX
4.4. Create a Graph Using GraphX and Property Graph
4.5. List of Operators
4.6. Perform Graph Operations Using GraphX
4.7. Triplet View
Section 5: Performing Spark GraphX Operations
5.1. Perform Subgraph Operations
5.2. Neighbourhood Aggregations – Collecting Neighbours
5.3. Counting Degree of Vertex
5.4. Caching and Uncaching
5.5. GraphBuilder
5.6. Vertex and Edge RDD
5.7. Structural Operators – Connected Components
Section 6: SparkR
6.1. Introduction to SparkR and How It’s Used? 00:00:00
6.2. Setting Up from RStudio
6.3. Creating Spark DataFrames from Data Sources
6.4. SparkDataFrames Operations – Grouping, Aggregation
6.5. Run a Given Function on a Large Dataset Using dapply or dapplyCollect
6.6. Running Large Dataset by Input Column(s) and Using gapply or gapplyCollect
6.7. Run Local R Functions Distributed Using spark.lapply
6.8. Running SQL Queries from SparkR
Section 7: Analytical Use Cases
7.1. PageRank Using Spark GraphX
7.2. Sending Real-Time Notification When a User Wants to Buy a Product on the E-Commerce Site
Mock Exam
Mock Exam : Advanced Analytics and Real-Time Data Processing in Apache Spark 00:40:00
Exam : Advanced Analytics and Real-Time Data Processing in Apache Spark 00:40:00

Student feedback


Average rating (3)
    Maisie Chambers

    June 05, 2019 - 7:33pm
    Very satisfied

    I became well-versed by the end of the course in real-time analytics and learned to implement using Apache Spark.

    Francis Brooks

    May 24, 2019 - 4:32pm

    I learned how to apply Apache Spark and to perform efficient data analysis and processing in real-time. It was an informative course, which I feel will be very helpful to my job.

    Alexis Day

    May 08, 2019 - 6:25pm
    A great platform

    I felt the tutor was highly experienced and offered guidelines on how to use the Spark’s MLlib library. It was a real hands-on experience.
