Course Outline
Introduction to AIOps with Open Source Tools
- Overview of AIOps concepts and benefits
- Prometheus and Grafana in the observability stack
- Where ML fits in AIOps: predictive vs. reactive analytics
Setting Up Prometheus and Grafana
- Installing and configuring Prometheus for time series collection
- Creating dashboards in Grafana using real-time metrics
- Exploring exporters, relabeling, and service discovery
Data Preprocessing for ML
- Extracting and transforming Prometheus metrics
- Preparing datasets for anomaly detection and forecasting
- Using Grafana’s transformations or Python pipelines
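As a rough illustration of the extraction step listed above, the sketch below flattens a Prometheus range-query response (`resultType: matrix`) into plain rows that could feed a dataset. The sample payload, the metric and instance names, and the helper name `matrix_to_rows` are invented for illustration; only the overall response shape follows the Prometheus HTTP API.

```python
import json

# Hypothetical sample shaped like a /api/v1/query_range response.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "matrix",
        "result": [
            {
                "metric": {"__name__": "node_load1", "instance": "host-a:9100"},
                "values": [[1700000000, "0.50"], [1700000060, "0.75"]],
            }
        ],
    },
})


def matrix_to_rows(raw):
    """Flatten a matrix result into (instance, timestamp, value) tuples."""
    body = json.loads(raw)
    if body.get("status") != "success":
        raise ValueError("query failed")
    rows = []
    for series in body["data"]["result"]:
        instance = series["metric"].get("instance", "")
        for timestamp, value in series["values"]:
            # Prometheus encodes sample values as strings; convert to float.
            rows.append((instance, float(timestamp), float(value)))
    return rows


rows = matrix_to_rows(SAMPLE_RESPONSE)
```

In a real pipeline the raw text would come from an HTTP call to the Prometheus server rather than a hard-coded string, and the rows would typically be loaded into a dataframe before feature engineering.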
Applying Machine Learning for Anomaly Detection
- Basic ML models for outlier detection (e.g., Isolation Forest, One-Class SVM)
- Training and evaluating models on time series data
- Visualizing anomalies in Grafana dashboards
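To give a feel for outlier detection on a metric series, here is a minimal sketch that uses only the Python standard library. It is deliberately a simpler stand-in (a modified z-score based on the median absolute deviation), not Isolation Forest or One-Class SVM as named in the outline. The function name `mad_anomalies`, the threshold of 3.5, and the CPU values are assumptions chosen for illustration.

```python
from statistics import median


def mad_anomalies(values, threshold=3.5):
    """Return indices whose modified z-score exceeds the threshold."""
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        return []  # no spread, so this rule cannot flag anything
    # 0.6745 rescales the MAD so the score is comparable to a z-score.
    return [
        i for i, v in enumerate(values)
        if abs(0.6745 * (v - center) / mad) > threshold
    ]


# Hypothetical CPU utilisation samples (percent) with one obvious spike.
cpu = [41.0, 40.5, 42.0, 39.8, 41.2, 95.0, 40.9, 41.5]
flagged = mad_anomalies(cpu)
```

The flagged indices could then be written back as a separate series or as annotations so that they show up alongside the original metric in a Grafana panel.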
Forecasting Metrics with ML
- Building simple forecasting models (ARIMA, Prophet, LSTM intro)
- Predicting system load or resource usage
- Using predictions for early alerting and scaling decisions
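The idea of using predictions for early alerting can be sketched with a very small model. The example below fits a least-squares straight line to recent samples and estimates how long until a threshold is crossed. This is a plain linear trend, a much simpler stand-in for ARIMA, Prophet, or an LSTM; the function names, the 90% threshold, and the disk-usage values are all illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def time_until(threshold, xs, ys):
    """Estimate time from the last sample until the trend hits threshold."""
    slope, intercept = fit_line(xs, ys)
    if slope <= 0:
        return None  # flat or falling trend never reaches the threshold
    crossing = (threshold - intercept) / slope
    return max(0.0, crossing - xs[-1])


# Hypothetical hourly disk usage samples (percent).
hours = [0, 1, 2, 3, 4]
disk_pct = [70.0, 72.0, 74.0, 76.0, 78.0]
eta = time_until(90.0, hours, disk_pct)  # hours until 90% on this trend
```

An estimate like `eta` could drive an early warning (for example, alert when fewer than some number of hours remain) instead of waiting for a static threshold to be breached.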
Integrating ML with Alerting and Automation
- Defining alert rules based on ML output or thresholds
- Using Alertmanager and notification routing
- Triggering scripts or automation workflows on anomaly detection
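One way to picture the automation step above is a small handler that receives an Alertmanager webhook notification and decides which remediation to run. The sketch below only plans actions and does not execute anything. The alert names, the `PLAYBOOK` mapping, the action names, and the function `plan_actions` are hypothetical; the payload is trimmed to the `alerts`, `status`, and `labels` fields of the webhook format.

```python
# Hypothetical mapping from alert name to a remediation action name.
PLAYBOOK = {
    "CpuAnomaly": "restart_worker",
    "DiskForecastFull": "expand_volume",
}


def plan_actions(payload, playbook):
    """Return (action, instance) pairs for firing alerts with a known action."""
    actions = []
    for alert in payload.get("alerts", []):
        if alert.get("status") != "firing":
            continue  # ignore resolved alerts
        labels = alert.get("labels", {})
        action = playbook.get(labels.get("alertname"))
        if action is not None:
            actions.append((action, labels.get("instance", "")))
    return actions


# Trimmed, made-up example of a webhook notification body.
payload = {
    "version": "4",
    "status": "firing",
    "alerts": [
        {"status": "firing",
         "labels": {"alertname": "CpuAnomaly", "instance": "host-a:9100"}},
        {"status": "resolved",
         "labels": {"alertname": "DiskForecastFull", "instance": "host-b:9100"}},
    ],
}
actions = plan_actions(payload, PLAYBOOK)
```

Keeping the decision logic separate from execution makes it easier to test and to add safeguards (rate limits, approvals) before any script is actually triggered.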
Scaling and Operationalizing AIOps
- Integrating external observability tools (e.g., ELK stack, Moogsoft, Dynatrace)
- Operationalizing ML models in observability pipelines
- Best practices for AIOps at scale
Summary and Next Steps
Requirements
- An understanding of system monitoring and observability concepts
- Experience using Grafana or Prometheus
- Familiarity with Python and basic machine learning principles
Audience
- Observability engineers
- Infrastructure and DevOps teams
- Monitoring platform architects and site reliability engineers (SREs)
Duration: 14 hours