Author Message
rabbott
Posts: 1649
Posted 11:06 Nov 15, 2018 |

My initial impression from talking to project teams yesterday is that most teams need quite a bit of help with their projects. Given Veterans Day and the Thanksgiving week holiday, we have very little class time left to talk.

School will be open, but with no classes, Monday and Tuesday of next week. I am willing to come in and talk to you about your project on either of those days. Let's make it 10:00am, the usual class time, in the usual classroom. If you want to come in to talk, please let me know by 6pm Sunday. (Otherwise, there is no point in my making the trip.)

Thanks.

Comments on some of the projects.

Use features.  A feature-based approach is required for all projects. To review how a feature-based approach works, look at the Pacman assignment.

Pacman - capture the flag. The code as given includes classes for a defensive player and an offensive player. A good place to begin is to understand how these two player types are coded. What makes one of them defensive and the other offensive?

Cart-Pole. I recommend that you copy the environment and use the copy to predict the effect of an action. You can then apply your feature extractor to the computed result of each possible action to see how well that action does.
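The copy-and-predict idea can be sketched roughly as follows. This is only an illustration, not the required design. ToyEnv is a made-up stand-in with invented dynamics so that the snippet runs on its own; predict, score, and choose_action are hypothetical names. It assumes the real environment can be copied with copy.deepcopy and that step returns a tuple whose first element is the observation.

```python
import copy


class ToyEnv:
    """Hypothetical stand-in for the real environment (made-up dynamics)."""

    def __init__(self):
        # cart position, cart velocity, pole angle, pole rotational velocity
        self.state = [0.0, 0.0, 0.05, 0.0]

    def step(self, action):
        # Invented physics for illustration: action 1 pushes right, 0 pushes left.
        push = 1.0 if action == 1 else -1.0
        x, v, theta, omega = self.state
        v += 0.1 * push
        omega -= 0.15 * push
        self.state = [x + 0.02 * v, v, theta + 0.02 * omega, omega]
        # Same shape as the classic (observation, reward, done, info) tuple.
        return list(self.state), 1.0, False, {}


def predict(env, action):
    """Return the observation `action` would produce, leaving `env` untouched."""
    trial = copy.deepcopy(env)      # act on a copy, never on the real environment
    return trial.step(action)[0]    # first element is the observation


def score(observation):
    """Placeholder evaluation: negative sum of squares, so bigger numbers score worse."""
    return -sum(x * x for x in observation)


def choose_action(env, actions=(0, 1)):
    """Pick the action whose predicted observation scores best."""
    return max(actions, key=lambda a: score(predict(env, a)))
```

In your own project, score would be replaced by your feature extractor combined with learned weights; the fixed weights here are only a placeholder.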

With respect to features, you might start by defining the features more or less as returned by the environment: cart position, cart velocity, pole angle, pole rotational velocity. But since it is the magnitude of these numbers rather than their sign that matters, instead of using the numbers as given, you might take their absolute values as your features. Alternatively, instead of taking the absolute value, you might square the numbers. That will both eliminate the sign and magnify the difference between numbers close to zero (which are good) and numbers above 1 (which are bad). The magnification might make it easier for the system to find useful weights.
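As a small sketch of the two options, assuming the observation arrives as a sequence of four numbers in the order listed above (the function names are just illustrative):

```python
def abs_features(observation):
    """Drop the sign: only the distance from zero is kept."""
    return [abs(x) for x in observation]


def squared_features(observation):
    """Drop the sign and stretch the scale: values below 1 shrink, values above 1 grow."""
    return [x * x for x in observation]


# Example observation: position, velocity, angle, rotational velocity.
obs = [-0.5, 2.0, 0.25, -1.5]
print(abs_features(obs))      # [0.5, 2.0, 0.25, 1.5]
print(squared_features(obs))  # [0.25, 4.0, 0.0625, 2.25]
```

Note how squaring pushes 0.25 down to 0.0625 while pushing 2.0 up to 4.0, which is the magnification described above.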

Since big numbers are bad, expect the learned weights to be less than zero. Alternatively, if you want positive weights, since the magnitudes of the observations are all less than 3, your feature for each number might be: 3 - abs(number). (If you square the number, you may have to subtract from a constant larger than 3.)
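Here is a rough sketch of the positive-weight variant. The bound of 3 comes from the paragraph above. Using 9 (that is, 3 squared) as the constant for the squared version is my own assumption, chosen so the feature stays positive whenever the magnitude is below 3. The function names and the all-ones weights are placeholders.

```python
BOUND = 3.0  # magnitude bound assumed in the post


def headroom_features(observation, bound=BOUND):
    """Feature is large when the number is near zero: bound - abs(number)."""
    return [bound - abs(x) for x in observation]


def squared_headroom_features(observation, bound=BOUND):
    """Squared variant; bound**2 is an assumed constant that keeps values positive."""
    return [bound * bound - x * x for x in observation]


def evaluate(features, weights):
    """Linear evaluation: weighted sum of the features."""
    return sum(w * f for w, f in zip(weights, features))


weights = [1.0, 1.0, 1.0, 1.0]          # placeholder positive weights
centered = [0.0, 0.0, 0.0, 0.0]         # cart centered, pole upright, nothing moving
drifting = [-0.5, 2.0, 0.25, -1.5]
print(evaluate(headroom_features(centered), weights))  # 12.0
print(evaluate(headroom_features(drifting), weights))  # 7.75
```

With features defined this way, a higher evaluation means a better state, so useful weights should come out positive.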

 

Last edited by rabbott at 11:09 Nov 15, 2018.