Author Message
rabbott
Posts: 1649
Posted 18:41 Nov 17, 2018 |

The Taxi problem as originally presented has close to 500 states. (A few of the theoretical states are not possible.) As we discussed in class today, two simple features can express the essential information in those states.

1. Passenger state: waiting for taxi, on taxi, dropped off at destination.

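2. Taxi shortest-path distance to the current destination. The current destination depends on the passenger state (the pickup location while the passenger is waiting, the drop-off location once the passenger is on the taxi), so the feature extractor should determine the passenger state first.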
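Here is a minimal sketch of a feature extractor along these lines. The grid size, the wall layout, the state encoding, and all function names are assumptions made for illustration; they are not the actual Taxi environment or a required interface. Breadth-first search is used for the distance because walls can make the shortest path longer than the Manhattan distance.

```python
from collections import deque

SIZE = 5  # assumed 5x5 grid
# Hypothetical walls: pairs of adjacent cells the taxi cannot move between.
WALLS = {((0, 1), (0, 2)), ((1, 1), (1, 2))}

WAITING, ON_TAXI, DROPPED_OFF = 0, 1, 2  # assumed encoding of feature 1


def blocked(a, b):
    """True if a wall separates adjacent cells a and b."""
    return (a, b) in WALLS or (b, a) in WALLS


def shortest_path_distance(start, goal):
    """Breadth-first search over grid cells, respecting WALLS."""
    frontier = deque([(start, 0)])
    seen = {start}
    while frontier:
        (row, col), dist = frontier.popleft()
        if (row, col) == goal:
            return dist
        for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (row + d_row, col + d_col)
            inside = 0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE
            if inside and nxt not in seen and not blocked((row, col), nxt):
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return None  # goal not reachable


def extract_features(taxi, passenger_state, pickup, dropoff):
    """Return (passenger state, distance to the current destination)."""
    # Determine the passenger state first: it selects the destination.
    if passenger_state == WAITING:
        target = pickup
    elif passenger_state == ON_TAXI:
        target = dropoff
    else:
        return (DROPPED_OFF, 0)  # nothing left to drive to
    return (passenger_state, shortest_path_distance(taxi, target))
```

With this layout, a taxi at (0, 0) and a waiting passenger at (0, 4) gives a distance of 8 rather than 4, since the taxi has to drive around the walls.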

The system should learn weights that reflect the fact that feature 1 is an order of magnitude more important than feature 2.
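One way to picture this is a linear value estimate: a weighted sum of the two features. The sketch below assumes that form and a simple gradient-style update toward a target value. Both are assumptions for illustration, as are the function names and the example weights, which are hand-picked to show the order-of-magnitude gap and are not learned results.

```python
def estimate_value(weights, features):
    """Linear value estimate: the weighted sum of the features."""
    return sum(w * f for w, f in zip(weights, features))


def update_weights(weights, features, target, learning_rate=0.01):
    """Move each weight a small step so the estimate gets closer to target."""
    error = target - estimate_value(weights, features)
    return [w + learning_rate * error * f for w, f in zip(weights, features)]


# Illustrative weights only: passenger state counts ten times as much as
# one step of distance, and distance counts against the estimate.
example_weights = [10.0, -1.0]
example_features = (1, 3)  # passenger on taxi, 3 steps from the drop-off
example_value = estimate_value(example_weights, example_features)
```

The target passed to update_weights would come from whatever learning rule you use (for example, observed reward plus the discounted estimate of the next state).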

To determine the best action, the system tries each possible action on the current world, computes the estimated value of each resulting world, and picks the action whose resulting world has the highest estimated value.

That should be fairly straightforward.
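The action-selection step can be sketched as a one-step lookahead. Everything named here is hypothetical: `simulate` stands in for whatever applies an action to a copy of the world, `extract` for the feature extractor, and the toy world at the bottom exists only to make the sketch runnable.

```python
def best_action(state, actions, simulate, extract, weights):
    """Try every action and return the one whose result scores highest."""
    def value_after(action):
        features = extract(simulate(state, action))
        return sum(w * f for w, f in zip(weights, features))
    return max(actions, key=value_after)


# Toy world for illustration: the state is already the feature pair
# (passenger_state, distance), and actions only change the distance.
def toy_simulate(state, action):
    passenger_state, distance = state
    if action == "closer":
        return (passenger_state, max(distance - 1, 0))
    return (passenger_state, distance + 1)


def toy_extract(state):
    return state


chosen = best_action(
    state=(0, 3),
    actions=["farther", "closer"],
    simulate=toy_simulate,
    extract=toy_extract,
    weights=[10.0, -1.0],  # illustrative weights, not learned values
)
```

In the real problem, simulate would need to work on a copy of the world so that trying an action does not change the actual state.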

So, I recommend that you build (or copy) a framework for working with features and weights and try it out.

Last edited by rabbott at 18:42 Nov 17, 2018.