Course Outline

Introduction to the Stratio Platform

  • Overview of Stratio architecture and core modules
  • Role of Rocket and Intelligence in the data lifecycle
  • Logging in and navigating the Stratio UI

Working with the Rocket Module

  • Data ingestion and pipeline creation
  • Connecting data sources and configuring transformations
  • Using PySpark for preprocessing tasks in Rocket (see the sketch below)
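
To give a sense of the preprocessing work covered in this module, here is a minimal PySpark sketch of the kind of cleanup a Rocket pipeline step might perform. The file path and column names are hypothetical, and Rocket-specific ingestion and configuration are intentionally left out.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical standalone example; inside Rocket, the SparkSession and the
# source connection would normally be provided by the pipeline itself.
spark = SparkSession.builder.appName("preprocessing-sketch").getOrCreate()

raw = (spark.read
       .option("header", True)
       .option("inferSchema", True)
       .csv("/data/raw/orders.csv"))          # illustrative path

cleaned = (raw
           .dropDuplicates(["order_id"])                       # drop duplicate records
           .withColumn("amount", F.col("amount").cast("double"))
           .filter(F.col("amount") > 0)                        # discard invalid rows
           .withColumn("order_date", F.to_date("order_ts")))   # normalize the timestamp

cleaned.show(5)
```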

PySpark Essentials for Stratio Users

  • PySpark data structures and operations
  • Control flow: for and while loops, if/else conditionals
  • Writing custom functions with def and applying them (see the sketch below)
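
The snippet below is a minimal sketch of these essentials in a PySpark setting: a plain Python function (def) used to build a reusable column expression, when/otherwise standing in for if/else at the DataFrame level, and a for loop driving repetitive column work. The sample data and column names are invented for illustration.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("pyspark-essentials-sketch").getOrCreate()

df = spark.createDataFrame(
    [("alice", 34), ("bob", 17), ("carol", 52)],
    ["name", "age"],
)

# A custom function (def) that returns a column expression;
# when/otherwise plays the role of if/else on DataFrame columns.
def age_band(col):
    return (F.when(col < 18, "minor")
             .when(col < 50, "adult")
             .otherwise("senior"))

df = df.withColumn("band", age_band(F.col("age")))

# A for loop with an if check, here used to trim every string column.
for name, dtype in df.dtypes:
    if dtype == "string":
        df = df.withColumn(name, F.trim(F.col(name)))

df.show()
```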

Advanced Usage of Rocket with PySpark

  • Streaming ingestion and transformations (see the sketch after this list)
  • Using loops and functions in batch and real-time scenarios
  • Best practices for performance in PySpark pipelines
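
As a rough illustration of how batch-style transformations carry over to streaming, the sketch below uses Spark Structured Streaming with a socket source as a stand-in for a real connector; in Rocket, the streaming source and sink would typically be configured in the module rather than in code.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("streaming-sketch").getOrCreate()

# Socket source used purely as a placeholder for a real streaming connector.
events = (spark.readStream
          .format("socket")
          .option("host", "localhost")
          .option("port", 9999)
          .load())

# The same DataFrame API as in batch: a windowed count per minute of arrival.
counts = (events
          .withColumn("ts", F.current_timestamp())
          .groupBy(F.window("ts", "1 minute"))
          .count())

query = (counts.writeStream
         .outputMode("complete")
         .format("console")
         .start())

query.awaitTermination()
```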

Exploring the Intelligence Module

  • Overview of data modeling and analysis features
  • Feature selection, transformation, and exploration (see the sketch after this list)
  • Role of PySpark in custom analytics and insights
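
A minimal sketch of the feature transformation step, assuming standard Spark ML components (VectorAssembler and StandardScaler) and invented column names; the Intelligence-specific tooling around this step is not shown.

```python
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler, StandardScaler

spark = SparkSession.builder.appName("feature-prep-sketch").getOrCreate()

# Hypothetical numeric features; in practice these come from a prepared dataset.
df = spark.createDataFrame(
    [(1, 12.0, 3.5), (2, 7.5, 1.2), (3, 20.1, 4.8)],
    ["id", "spend", "visits"],
)

# Assemble raw columns into a single feature vector, then standardize it.
assembler = VectorAssembler(inputCols=["spend", "visits"], outputCol="features_raw")
scaler = StandardScaler(inputCol="features_raw", outputCol="features")

assembled = assembler.transform(df)
scaled = scaler.fit(assembled).transform(assembled)
scaled.select("id", "features").show(truncate=False)
```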

Building Advanced Analytics Workflows

  • Creating user-defined functions (UDFs) in Intelligence
  • Applying conditionals and loops for data logic
  • Use cases: segmentation, aggregation, and prediction (see the sketch below)
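
The following is one possible shape for such logic in PySpark: a Python UDF wrapping if/elif conditionals for segmentation, followed by a per-segment aggregation. Column names, thresholds, and segment labels are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("udf-segmentation-sketch").getOrCreate()

df = spark.createDataFrame(
    [("c1", 120.0), ("c2", 15.0), ("c3", 480.0)],
    ["customer_id", "monthly_spend"],
)

# A user-defined function holding the conditional segmentation logic.
@F.udf(returnType=StringType())
def segment(spend):
    if spend is None or spend < 50:
        return "low"
    elif spend < 250:
        return "medium"
    else:
        return "high"

segmented = df.withColumn("segment", segment("monthly_spend"))

# Per-segment aggregation, e.g. as input to a downstream prediction step.
segmented.groupBy("segment").agg(
    F.count("*").alias("customers"),
    F.avg("monthly_spend").alias("avg_spend"),
).show()
```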

Deployment and Collaboration

  • Saving, exporting, and reusing workflows (see the sketch after this list)
  • Collaborating with other team members on Stratio
  • Reviewing output and integrating with downstream tools
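
To illustrate the hand-off to downstream tools, here is a minimal sketch that publishes a finished result both as partitioned Parquet files and as a catalog table. The paths, database, and table names are hypothetical; how results are produced and shared inside Stratio is covered in the module itself.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("export-sketch").getOrCreate()

# Hypothetical finished output of a workflow.
result = spark.read.parquet("/data/curated/customer_segments")

# Export as partitioned Parquet for file-based downstream consumers.
(result.write
 .mode("overwrite")
 .partitionBy("segment")
 .parquet("/data/published/customer_segments"))

# Or register it as a table so SQL-based tools can query it directly
# (assumes an "analytics" database already exists in the catalog).
(result.write
 .mode("overwrite")
 .saveAsTable("analytics.customer_segments"))
```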

Summary and Next Steps

Requirements

  • Experience with Python programming
  • Understanding of data analytics or big data processing concepts
  • Basic knowledge of Apache Spark and distributed computing

Audience

  • Data engineers working on Stratio-based platforms
  • Analysts or developers using Rocket and Intelligence modules
  • Technical teams transitioning to PySpark workflows within Stratio
Duration: 14 hours
