rabbott (Posts: 1649)
Posted 14:30 Dec 01, 2018
Look at the extract from one of the Berkeley slides (attached below). The equation in the blue rectangle shows the definition of Q(s, a) in terms of weighted features:

Q(s, a) = w1·f1(s, a) + w2·f2(s, a) + … + wn·fn(s, a)

Each feature is shown as a function of a state and an action. In reality, especially for the problems we are working on, we "factor out" the action and can think of the equation on the slide as if it read:
Q(s, a) = w1·f1(s') + w2·f2(s') + … + wn·fn(s')

where s' is the state that results from taking action a in state s. In state-based q-learning, one updates one's estimate of Q(s, a). A couple of things are worth emphasizing.

1. Q(s, a) includes the reward for taking action a in state s.

2. Features are extracted from the states without regard to the action that one might take in those states. In other words, a feature is a function of a state alone, whereas Q is a function of both a state and an action.

The features are intended to be abstractions that isolate the important properties of states. For example, in the taxi problem a feature may be whether the passenger is on the taxi. That feature applies, either true or false, to many different states. It identifies an important property of all of them.
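To make the weighted-feature definition concrete, here is a minimal sketch of approximate q-learning with a linear feature representation, matching the slide's equation Q(s, a) = w1·f1(s, a) + … + wn·fn(s, a). The feature values, learning rate, and discount factor below are illustrative assumptions, not from the slide.

```python
ALPHA = 0.1   # learning rate (assumed for illustration)
GAMMA = 0.9   # discount factor (assumed for illustration)

def q_value(weights, features):
    """Q(s, a) as a weighted sum of feature values f_i(s, a)."""
    return sum(w * f for w, f in zip(weights, features))

def update(weights, features, reward, max_next_q):
    """One approximate q-learning weight update:
       difference = [r + gamma * max_a' Q(s', a')] - Q(s, a)
       w_i <- w_i + alpha * difference * f_i(s, a)
    """
    difference = (reward + GAMMA * max_next_q) - q_value(weights, features)
    return [w + ALPHA * difference * f for w, f in zip(weights, features)]

# Example: two features for the current (s, a), e.g. a 0/1
# "passenger on taxi" indicator and a constant bias term.
weights = [0.0, 0.0]
features = [1.0, 1.0]
weights = update(weights, features, reward=10.0, max_next_q=0.0)
```

Note that each feature value is shared across every state it describes, so one update shifts the estimated Q-value of all states with that feature at once.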
Last edited by rabbott at 19:53 Dec 01, 2018.
rabbott (Posts: 1649)
Posted 19:56 Dec 01, 2018
The last two lines of the previous post are not correct. (I left them but put a strike-through line through them, because some people may already have seen them.) The problem is that an action that takes a set of features that includes, for example, …

Last edited by rabbott at 19:58 Dec 01, 2018.