memento-value-function-approximation

Differences

This shows you the differences between two versions of the page.

memento-value-function-approximation [2025/07/15 22:09]
20.171.207.253 old revision restored (2025/07/03 06:14)
memento-value-function-approximation [2025/07/19 18:40] (current)
216.73.216.28 old revision restored (2025/07/17 18:27)
Line 8: Line 8:
    * Fourier
    * ...
- 
-===Gradient descent===
- 
-Let J(w) be a differentiable function of the parameter vector w (w holds the weights of the approximator; in the table-lookup case it contains one value per state).
- 
-The gradient of J(w) is the column vector of its partial derivatives with respect to each component of w, [[http://www0.cs.ucl.ac.uk/staff/D.Silver/web/Teaching_files/FA.pdf | see slide 11]]
- 
-Gradient descent finds a local minimum of J(w) by repeatedly adjusting w in the direction of the negative gradient.
- 
-Goal: find the parameter vector w that minimizes the squared error between the approximate value and the true value.
- 
- 
-Questions: 
-   * What does Δw represent (a single value, a vector, ...), and what is it used for?
- 
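As a rough illustration of the gradient descent step, here is a minimal sketch in plain Python. It assumes the usual update Δw = -1/2 * step-size * gradient of J(w), so Δw is a vector with the same length as w that is added to w at every step. The objective (squared distance to a fixed `target` vector), the function names and the step size are hypothetical choices made only for this example.

```python
def gradient_descent(grad_J, w0, step_size=0.1, n_steps=200):
    """Repeat w <- w + delta_w, with delta_w = -1/2 * step_size * grad_J(w)."""
    w = list(w0)
    for _ in range(n_steps):
        g = grad_J(w)
        # delta_w has one component per component of w
        delta_w = [-0.5 * step_size * g_i for g_i in g]
        w = [w_i + d_i for w_i, d_i in zip(w, delta_w)]
    return w


# Hypothetical objective: J(w) = sum_i (w_i - target_i)^2,
# whose gradient is 2 * (w - target).
target = [1.0, -2.0, 0.5]


def grad_J(w):
    return [2.0 * (w_i - t_i) for w_i, t_i in zip(w, target)]


w_star = gradient_descent(grad_J, w0=[0.0, 0.0, 0.0])
```

With this objective, `w_star` ends up very close to `target`, which is the unique minimum of J.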
-===Representing a state as a vector===
- 
-Store the n values that describe a single state together in one vector (the state's feature vector).
- 
-===Linear value function approximation===
- 
-   * Stochastic gradient descent converges to the global optimum, because the objective is quadratic in w when the approximation is linear.
-   * Update = step-size * prediction error * feature value
- 
-Questions: 
-   * What is a feature?
- 
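The update rule "step-size * prediction error * feature value" can be sketched as follows. This is only an illustration under stated assumptions: the three states, their two features (a constant 1 and a hypothetical position) and the "true" values are invented for the example, and the true value is assumed to be supplied directly, whereas in reinforcement learning it would be replaced by a target such as a sampled return.

```python
def predict(w, x):
    """Linear value estimate: dot product of the weights and the feature vector."""
    return sum(w_i * x_i for w_i, x_i in zip(w, x))


def sgd_update(w, x, true_value, step_size):
    """One stochastic gradient step: each weight moves by
    step_size * prediction error * feature value."""
    error = true_value - predict(w, x)
    return [w_i + step_size * error * x_i for w_i, x_i in zip(w, x)]


# Hypothetical data: each state is described by [1.0, position].
features = {"s0": [1.0, 0.0], "s1": [1.0, 1.0], "s2": [1.0, 2.0]}
# Hypothetical true values, exactly linear in the features: v = 1 + 2 * position.
true_values = {"s0": 1.0, "s1": 3.0, "s2": 5.0}

w = [0.0, 0.0]
for _ in range(500):  # sweep repeatedly over the three states
    for state, x in features.items():
        w = sgd_update(w, x, true_values[state], step_size=0.1)
```

In this sketch a feature is simply one of the numbers describing a state, and the weights approach [1.0, 2.0] because the chosen true values are exactly representable by a linear function of the features.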
  
  
  
  
memento-value-function-approximation.1752610178.txt.gz · Last modified: 2025/07/15 22:09 by 20.171.207.253