thosehippos2 karma2021-03-24 16:07:11 UTC
On the note of exploration: Even if we were able to get provably correct exploration strategies from tabular learning (like R-max) to work in function approximation settings, it seems like the number of states to explore in a real-ish domain is too high to exhaustively explore. How do you think priors play into this, especially with respect to provability and guarantees?
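For context on why tabular guarantees break down at scale: R-max's core rule is to treat any under-visited state-action pair as maximally rewarding, which forces the agent to visit everything. A minimal sketch of just that optimism rule (names, the visit threshold `m`, and the reward bound are illustrative, not from any specific implementation):

```python
# Sketch of R-max's optimism rule (tabular case): a state-action pair
# visited fewer than M times is assumed to yield the maximum reward,
# which drives the agent to explore it. This per-pair bookkeeping is
# exactly what fails to scale when the state space is huge.
from collections import defaultdict

R_MAX = 1.0   # assumed known upper bound on reward
M = 5         # visits before a state-action pair counts as "known"

counts = defaultdict(int)
reward_sums = defaultdict(float)

def record(state, action, reward):
    """Log one observed transition's reward for (state, action)."""
    counts[(state, action)] += 1
    reward_sums[(state, action)] += reward

def optimistic_reward(state, action):
    """Empirical mean reward if (s, a) is known, else the optimistic R_MAX."""
    n = counts[(state, action)]
    if n < M:
        return R_MAX  # optimism in the face of uncertainty
    return reward_sums[(state, action)] / n
```

The point of the question stands out in the sketch: every distinct state-action pair needs its own counter, so in a realistic domain the "known" threshold is never reached for most of the space.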
thosehippos1 karma2021-03-24 16:56:37 UTC
- Inductive Bias: Awesome! Thanks!
- https://arxiv.org/pdf/1911.05815.pdf (edit: will read this in more detail! Very interesting!): Block MDPs like the ones used in your paper (and extending current work beyond them) are of particular interest to me. I also have some work on latent state learning in Block MDPs (https://arxiv.org/pdf/2006.03465.pdf) focusing on generalization capability.
Do you have thoughts on what assumptions from Block MDPs (ex: uniqueness of underlying state based on observation) are reasonable in realistic tasks and which are potentially limiting?
- Go-Explore/State Abstraction: That's very true; I hadn't thought of it that way before. I'm trying to determine whether there exists some general representation function (like image downsampling) that's "good enough" for a set of tasks (ex: household robotics or Atari games), or whether we need to learn task-specific representations. I suppose this is somewhat in line with a generalization vs. adaptation argument.
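The "general representation function" idea above can be made concrete with the kind of downsample-and-quantize abstraction Go-Explore uses for Atari frames: many visually similar observations collapse to the same small cell key. A minimal sketch (the target size and quantization levels are illustrative choices, not values from the paper):

```python
# Sketch of a downsampling-based state abstraction: average-pool a
# grayscale frame to a tiny grid, quantize intensities to a few levels,
# and return a hashable key, so visit counts can be kept per "cell".
import numpy as np

def downsample_cell(frame, size=(11, 8), levels=8):
    """Map a (H, W) grayscale frame with values in [0, 256) to a cell key."""
    sh, sw = size
    h, w = frame.shape
    # Crop so the frame divides evenly into the target grid
    frame = frame[: h - h % sh, : w - w % sw]
    # Block average-pool down to (sh, sw)
    pooled = frame.reshape(sh, frame.shape[0] // sh,
                           sw, frame.shape[1] // sw).mean(axis=(1, 3))
    # Quantize to a few intensity levels
    cell = np.floor(pooled / 256.0 * levels).astype(np.uint8)
    return cell.tobytes()  # hashable: usable as a dict key for counts
```

Whether one fixed `size`/`levels` setting is "good enough" across a task family, versus needing a learned, task-specific encoder, is exactly the generalization vs. adaptation question raised above.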