About This Course
Who this course is for:
- Anyone with basic Python skills who wants to get started in Reinforcement Learning
- Experienced AI Engineers, ML Engineers, Data Scientists, and Software Engineers wanting to apply Reinforcement Learning to real business problems
- Business professionals who want to learn how Reinforcement Learning can help automate adaptive decision-making processes
What you’ll learn:
- Understand and be able to identify Multi-Armed Bandit (MAB) problems
- Model real business problems as MAB and implement digital AI agents to automate them
- Understand the challenge of Reinforcement Learning regarding the exploration-exploitation dilemma
- Practically implement the various algorithmic strategies for balancing exploration and exploitation
- Python implementation of the Epsilon-greedy strategy
- Python implementation of the Softmax Exploration strategy
- Python implementation of the Optimistic Initialization strategy
- Python implementation of the Upper Confidence Bounds (UCB) strategy
- Understand the challenges of Reinforcement Learning in terms of the design of reward functions and sample efficiency
- Estimation of action values through incremental sampling
Requirements:
- Be able to understand basic OOP programs in Python
- Have basic Numpy and Matplotlib knowledge
- Basic algebra skills
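To give a flavour of one of the topics above, estimating action values through incremental sampling, here is a minimal sketch (not course material; the reward distribution and its mean are made-up values for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Incremental sample-average estimate of a single action's value:
# Q_{n+1} = Q_n + (1/n) * (R_n - Q_n), which equals the running mean
# of all rewards seen so far, without storing them.
q_estimate = 0.0
n = 0
true_value = 1.5  # hypothetical true mean reward of the action

for _ in range(10_000):
    reward = rng.normal(true_value, 1.0)  # noisy reward sample
    n += 1
    q_estimate += (reward - q_estimate) / n
```

After many samples, `q_estimate` converges toward the action's true mean reward, which is exactly how a bandit agent learns action values from experience.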
Software version used in the course:
- Python 3.9.5.
Through concise explanations, this course teaches you how to confidently translate seemingly intimidating mathematical formulas into Python code. We understand that not everyone is technically adept in mathematics, so this course intentionally stays away from maths unless it is necessary. And even where mathematics becomes necessary, the approach taken is one that anyone with basic algebra skills can follow and, most importantly, easily translate into code, building useful intuitions in the process.
Some of the algorithmic strategies taught in this course are Epsilon Greedy, Softmax Exploration, Optimistic Initialization, Upper Confidence Bounds, and Thompson Sampling. With these tools under your belt, you will be well equipped to build and deploy AI agents that can handle critical business operations under uncertainty.
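As a taste of the first of these strategies, an epsilon-greedy agent on a simulated bandit can be sketched in a few lines (this is an illustrative sketch, not the course's code; the arm means are invented):

```python
import numpy as np

rng = np.random.default_rng(42)

# A simulated 4-armed bandit: each arm pays a Gaussian reward around
# a hidden mean (these means are made up for illustration).
true_means = np.array([0.2, 0.5, 0.9, 0.1])
n_arms = len(true_means)

epsilon = 0.1             # exploration rate
q = np.zeros(n_arms)      # action-value estimates
counts = np.zeros(n_arms) # number of pulls per arm

for _ in range(5_000):
    if rng.random() < epsilon:
        arm = int(rng.integers(n_arms))  # explore: pick a random arm
    else:
        arm = int(np.argmax(q))          # exploit: pick the best estimate
    reward = rng.normal(true_means[arm], 1.0)
    counts[arm] += 1
    q[arm] += (reward - q[arm]) / counts[arm]  # incremental average update

print(int(np.argmax(q)))  # index of the arm the agent believes is best
```

With a small, fixed epsilon the agent keeps sampling every arm occasionally while spending most pulls on the arm it currently believes is best, which is the exploration-exploitation trade-off in its simplest form.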
Our Promise to You
By the end of this course, you will have learned to create multi-armed bandit algorithms.
30-Day Money-Back Guarantee. If you are unsatisfied for any reason, simply contact us and we’ll give you a full refund. No questions asked.
Get started today and learn how to build Reinforcement Learning agents in Python.
Section 1 - Introduction And Course Lessons
- Introduction To Reinforcement Learning And Multi-Armed Bandit Problems
- Implementing Simulated MAB Environments In Python
- Estimating Action Values Through Sampling
- Implementing Incremental Average In Code
- Implementing Incremental Average For Non-Stationary Bandits
- Building A Baseline Agent That Behaves Randomly
- Why Are The Results Not Repeatable?
- Implementing And Analysing A Greedy Agent
- Balancing Exploration And Exploitation With Epsilon Greedy Agents
- Controlling Exploration With A Decay
- Exploring Intelligently With Softmax Exploration
- Being Optimistic Under Uncertainties
- Realistic Optimism Under Uncertainties
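The lesson on incremental averages for non-stationary bandits addresses the case where an arm's reward distribution drifts over time. The standard fix, sketched below under assumed drift parameters (this is an illustration, not the course's code), is to replace the 1/n step size with a constant step size so that recent rewards dominate the estimate:

```python
import numpy as np

rng = np.random.default_rng(7)

alpha = 0.1        # constant step size: recent rewards weigh more
q_estimate = 0.0
true_value = 0.0   # the arm's mean reward drifts over time

for _ in range(20_000):
    true_value += rng.normal(0.0, 0.01)  # random-walk drift (non-stationary)
    reward = rng.normal(true_value, 1.0)
    # Exponential recency-weighted average:
    # Q_{n+1} = Q_n + alpha * (R_n - Q_n)
    q_estimate += alpha * (reward - q_estimate)

# q_estimate tracks the drifting true_value instead of averaging
# over the entire (now outdated) reward history.
```

With the plain 1/n running mean, old rewards never lose influence, so the estimate lags far behind a moving target; the constant step size trades a little estimation noise for the ability to keep up.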