TensorFlow

GSoC TensorFlow Part 7: Retrospective

As of August 27th, the Google Summer of Code coding phase is officially over. In this post, I look back at the summer, reviewing my accomplishments and shortcomings. Because I will continue contributing to TensorFlow and TF-Agents, I also outline my plans for the coming fall.

GSoC TensorFlow Part 6: Evaluating RND on Mountain Car

reinforcement-learning gsoc tensorflow

Having finished implementing Random Network Distillation by Burda et al., it is time to evaluate the algorithm in various hard-exploration environments. I start with Mountain Car, a simple environment that requires extensive exploration to reach the goal state.

GSoC TensorFlow Part 5: Implementing the Core of RND

reinforcement-learning gsoc tensorflow

This week, I give a brief summary of Random Network Distillation (RND), complemented with the code I have written for TF-Agents. I then list the remaining work needed to finish implementing RND and my plans for evaluating the algorithm once it is finished.

GSoC TensorFlow Part 4: First Evaluation

reinforcement-learning gsoc tensorflow

This week, I look back at the first coding phase of GSoC, summarizing my work and setting goals for the next phase.

GSoC TensorFlow Part 3: Simple Environment Wrapper with gin-config

reinforcement-learning gsoc tensorflow

This week, I implemented a simple environment wrapper to prepare myself for implementing curiosity modules.

GSoC TensorFlow Part 2: Improving Documentation

reinforcement-learning gsoc tensorflow

A great way to learn the material is to make modifications. This week, I summarize my experience of creating a pull request to TF-Agents to improve its documentation.

GSoC TensorFlow Part 1: Setting Up TF-Agents

reinforcement-learning gsoc tensorflow

I have been accepted to the Google Summer of Code program to work on TensorFlow for three months. I will be working on TF-Agents, TensorFlow's reinforcement learning library. In this post, I briefly summarize the steps I took to set up the TF-Agents environment for future reference.