Highest Rated Comments
-Ulkurz-2 karma
Aren't both of these examples related to learning in a simulated environment? Any use cases around offline/batch RL?
-Ulkurz-30 karma
Thank you for doing this AMA! My question is about applying RL to real-world problems. As we already know, it's often difficult to build a simulator or a digital twin for most real-world processes or environments, which kind of nullifies the idea of using online RL.
But this is where offline/batch RL can be helpful in terms of using large datasets collected via some process, from which a policy can be learned offline. We've already seen a lot of success in a supervised learning setting where an optimal model is learned offline from large volumes of data.
Although there has been a lot of fundamental research on offline/batch RL, I have not seen many real-world applications. Could you please share some of your own experience with this, if possible, with some use cases where batch/offline RL has been applied in the real world? Thanks!