Memento: Value Function Approximation



Problem: When the MDP becomes too large (too many states, too many actions), learning becomes slow.

Solution: Estimate the value function with a function approximator instead of storing a separate value for every state.

Several kinds of approximator exist:

  • Neural network
  • Decision tree
  • Fourier basis
  • ...
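
As an illustration of one entry in the list above, here is a minimal sketch of a linear approximator built on Fourier cosine features. It assumes a scalar state scaled to [0, 1]; the names `fourier_features` and `v_hat` and the weight values are hypothetical and chosen only for this example, not taken from the course slides.

```python
import math

def fourier_features(s, order):
    """Cosine features cos(pi * k * s) for k = 0..order.

    Assumes the scalar state s has been scaled to the interval [0, 1].
    """
    return [math.cos(math.pi * k * s) for k in range(order + 1)]

def v_hat(s, w):
    """Approximate value of state s as a weighted sum of its features."""
    x = fourier_features(s, len(w) - 1)
    return sum(wi * xi for wi, xi in zip(w, x))

# Illustrative weights only; in practice they would be learned.
w = [0.5, -0.2, 0.1]
print(v_hat(0.0, w), v_hat(1.0, w))
```

The point of the sketch is that three weights describe the value of every state in [0, 1], so the number of parameters no longer grows with the number of states.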

Gradient descent

Let J(w) be a differentiable function of the parameter vector w.

The gradient of J(w) is defined in matrix (column-vector) form; see slide 11.
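
Slide 11 is not reproduced in this memo. For reference, the standard definition of the gradient, which the slide presumably matches, stacks the partial derivatives of J with respect to each component of w (assuming w has n components):

```latex
\nabla_w J(w) =
\begin{pmatrix}
  \dfrac{\partial J(w)}{\partial w_1} \\
  \vdots \\
  \dfrac{\partial J(w)}{\partial w_n}
\end{pmatrix}
```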

Gradient descent finds a local minimum of J(w) by repeatedly adjusting w in the direction of the negative gradient.
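
A minimal sketch of the procedure, assuming the plain update w ← w − α ∇J(w) with a fixed step size α. The objective `J`, its gradient `grad_J`, the step size and the number of steps are hypothetical choices made for this example; the slides may use a different scaling of the step.

```python
def gradient_descent(grad, w, alpha=0.1, steps=200):
    """Repeatedly move w a small step against the gradient."""
    for _ in range(steps):
        g = grad(w)
        w = [wi - alpha * gi for wi, gi in zip(w, g)]
    return w

def J(w):
    """Toy objective with a single minimum at w = (3, -1)."""
    return (w[0] - 3) ** 2 + (w[1] + 1) ** 2

def grad_J(w):
    """Analytic gradient of J: one partial derivative per component of w."""
    return [2 * (w[0] - 3), 2 * (w[1] + 1)]

w_star = gradient_descent(grad_J, [0.0, 0.0])
print(w_star, J(w_star))
```

This toy objective is convex, so the local minimum reached is also the global one. For a non-convex J(w), such as the loss of a neural network, the same loop only guarantees a local minimum, and the result depends on the starting point.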

memento-value-function-approximation.1740333120.txt.gz · Last modified: 2025/02/23 18:52 by 47.128.63.79