Fundamentals of Reinforcement Learning Training




21 hours (usually 3 days, including breaks)


  • Experience with machine learning
  • Programming experience


  • Data scientists


Reinforcement Learning (RL) is a machine learning technique in which a computer program (the agent) learns to behave in an environment by performing actions and receiving feedback on their results. For each good action, the agent receives positive feedback (a reward), and for each bad action, it receives negative feedback (a penalty).
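
The feedback loop described above can be illustrated with a minimal sketch. It is not course material: the five-cell corridor environment, the reward values, and the names `step` and `train` are all assumptions made for illustration, and tabular Q-learning is used here only as one common way to learn from rewards and penalties.

```python
import random

N_STATES = 5          # hypothetical corridor with cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # move one cell left or one cell right

def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    if next_state == N_STATES - 1:
        return next_state, 1.0, True    # positive feedback for reaching the goal
    return next_state, -0.1, False      # small penalty for every other move

def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy action selection."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)                          # explore
            else:
                a = max(range(2), key=lambda i: q[state][i])  # exploit
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward + discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q_table = train()
# Greedy action for each non-goal cell after training.
greedy_policy = [ACTIONS[max(range(2), key=lambda i: row[i])] for row in q_table[:-1]]
print(greedy_policy)
```

In this toy setting the agent is never told which direction is correct; it only sees the reward or penalty returned by `step`, and its learned preferences come entirely from that feedback.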

This instructor-led, live training (online or onsite) is aimed at data scientists who wish to go beyond traditional machine learning approaches and teach a computer program to solve problems without labeled data or large data sets.

By the end of this training, participants will be able to:

  • Install and apply the libraries and programming language needed to implement Reinforcement Learning.
  • Create a software agent that is capable of learning through feedback instead of through supervised learning.
  • Program an agent to solve problems where decision making is sequential and finite.
  • Apply knowledge to design software that can learn in a way similar to how humans learn.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request customized training for this course, please contact us.



  • Learning through positive reinforcement

Elements of Reinforcement Learning

Important Terms (Actions, States, Rewards, Policy, Value, Q-Value, etc.)

Overview of Tabular Solution Methods

Creating a Software Agent

Understanding Value-based, Policy-based, and Model-based Approaches

Working with the Markov Decision Process (MDP)

How Policies Define an Agent's Way of Behaving

Using Monte Carlo Methods

Temporal-Difference Learning

n-step Bootstrapping

Approximate Solution Methods

On-policy Prediction with Approximation

On-policy Control with Approximation

Off-policy Methods with Approximation

Understanding Eligibility Traces

Using Policy Gradient Methods

Summary and Conclusion








