Author Message
jpatel77
Posts: 44
Posted 03:45 Oct 15, 2018 |

Tried writing code for the Pong-v0 env that lets the agent self-play, using trajectory calculation. Definitely not the greatest code :D, but it does a fair enough job.

Last edited by jpatel77 at 12:14 Oct 20, 2018.
rabbott
Posts: 1649
Posted 23:31 Oct 17, 2018 |

Running your Pong right now. Very good. Score tied 6 - 6. As I was watching, the program just quit without warning, notice, or explanation.

Last edited by rabbott at 23:32 Oct 17, 2018.
jpatel77
Posts: 44
Posted 08:28 Oct 18, 2018 |

Yes, I ran into the same issue: it crashed somehow, though I never ran it for more than 8 rounds. It sometimes missed, I think because of rounding during the slope calculation. I thought it would improve if I took the 1st and 3rd (or maybe 4th) coordinates of the puck instead of 2 consecutive ones, or recalibrated the target after every 15-20 observations. I never really tried it, though.
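To make the "recalibrate the target" idea concrete, here is a minimal sketch of predicting the row where the puck reaches the paddle's column, including bounces off the top and bottom walls. This is not the attached code: the function name `predict_target_row`, the wall positions (row 0 and row `height`), and the example numbers are all assumptions for illustration.

```python
def predict_target_row(x, y, slope, paddle_x, height):
    """Predict the row where the puck reaches column paddle_x.

    Assumes (hypothetically) that the puck moves in a straight line with
    the given slope (rows per column) and reflects perfectly off walls at
    row 0 and row `height`.
    """
    # Row the puck would reach if there were no walls at all.
    raw = y + slope * (paddle_x - x)
    # Reflections make the path periodic with period 2 * height, so fold
    # the unbounded row back into the playing field.
    period = 2 * height
    folded = raw % period  # always in [0, period) for a positive period
    return folded if folded <= height else period - folded


# No bounce: starts at row 0, climbs 1.5 rows per column for 10 columns.
print(predict_target_row(0, 0, 1.5, 10, 100))   # 15.0
# One bounce off the far wall at row 100.
print(predict_target_row(0, 90, 2, 10, 100))    # 90
# One bounce off the wall at row 0.
print(predict_target_row(0, 10, -2, 10, 100))   # 10
```

Recalibrating would then just mean calling this again every 15-20 observations with the latest puck position and a freshly estimated slope, so an early rounding error does not persist for the whole rally.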

Last edited by jpatel77 at 15:47 Oct 18, 2018.
rabbott
Posts: 1649
Posted 15:24 Oct 18, 2018 |

The slope seems very constrained: 0, +/-1, or +/-2. Those seem to be the only slopes.

jpatel77
Posts: 44
Posted 01:35 Oct 19, 2018 |

Exactly, and that's because it only looks at 2 consecutive observations. Say the real slope for a particular trajectory is 1.5: its first and third coordinates would be (0,0) and (2,3), and the second observation should theoretically be at (1,1.5). But the y value has to be a whole number to be plotted on the canvas, so it becomes either 1 or 2, and the slope computed from two consecutive points comes out as 1 or 2 instead of 1.5 (the rounding effect). I will update the code and try this soon.
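A small sketch of the rounding effect described above, using the same slope of 1.5. The helper names (`observed_points`, `estimate_slope`) and the round-half-up rule are assumptions made for illustration; the real environment may snap to pixels differently.

```python
import math


def round_half_up(value):
    """Snap a coordinate to the nearest whole pixel (assumed rounding rule)."""
    return math.floor(value + 0.5)


def observed_points(true_slope, steps):
    """Pixel positions of a puck that starts at (0, 0) and moves one column per frame."""
    return [(x, round_half_up(true_slope * x)) for x in range(steps + 1)]


def estimate_slope(points, gap):
    """Estimate the slope from the first observation and the one `gap` frames later."""
    (x0, y0), (x1, y1) = points[0], points[gap]
    return (y1 - y0) / (x1 - x0)


points = observed_points(1.5, 4)
print(points)                     # [(0, 0), (1, 2), (2, 3), (3, 5), (4, 6)]
print(estimate_slope(points, 1))  # 2.0  (consecutive points: rounded up)
print(estimate_slope(points, 2))  # 1.5  (1st and 3rd points: exact here)
```

With consecutive observations the estimate can only be a whole number, which matches the 0, +/-1, +/-2 slopes seen earlier in the thread. A wider gap does not remove the rounding error, but it divides it by the gap, so the estimate gets closer to the true slope.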