Reinforcement Learning: Computing an Exponentially Weighted Average
IMCOMKING
2019. 1. 22. 13:03
https://gist.github.com/imcomking/b1acbb891ac4baa69f32d9eb4c221fb9
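import numpy as np

def exponentially_weighted_matrix(discount, mat_len):
    # Upper-triangular matrix with DisMat[i, j] = discount ** (j - i) for j >= i.
    # A boolean mask (instead of testing `== 0`) keeps discount = 0 working.
    upper = np.triu(np.ones((mat_len, mat_len), dtype=bool), k=1)
    DisMat = np.where(upper, float(discount), 1.0)
    DisMat = np.cumprod(DisMat, axis=1)
    DisMat = np.triu(DisMat)
    return DisMat

def exponentially_weighted_cumsum(discount, np_data):
    # value[i] = sum over j >= i of discount ** (j - i) * np_data[j]
    DisMat = exponentially_weighted_matrix(discount, np_data.shape[0])
    value = np.dot(DisMat, np_data.reshape(-1, 1))
    # Flatten the column vector and return it in reversed order (last step first).
    return value[::-1].transpose()[0]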
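As a sanity check on what the matrix product computes, here is a minimal sketch that compares it with the usual backward recursion for discounted returns. The helper names `discount_matrix` and `discounted_returns_loop` are made up for this illustration and are not part of the gist. Note that `exponentially_weighted_cumsum` above returns the same values in reversed order.

```python
import numpy as np

def discount_matrix(discount, n):
    # Hypothetical helper: M[i, j] = discount ** (j - i) for j >= i, else 0.
    idx = np.arange(n)
    exponent = idx[None, :] - idx[:, None]  # j - i
    powers = float(discount) ** np.clip(exponent, 0, None)
    return np.where(exponent >= 0, powers, 0.0)

def discounted_returns_loop(discount, rewards):
    # Backward recursion: G[t] = r[t] + discount * G[t + 1]
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + discount * running
        returns[t] = running
    return returns

rewards = np.array([1.0, 2.0, 3.0])
discount = 0.5

via_matrix = discount_matrix(discount, len(rewards)) @ rewards
via_loop = discounted_returns_loop(discount, rewards)

# G[2] = 3, G[1] = 2 + 0.5 * 3 = 3.5, G[0] = 1 + 0.5 * 3.5 = 2.75
print(via_matrix)
print(np.allclose(via_matrix, via_loop))
```

The matrix form costs O(n^2) memory, so it is probably best suited to short episodes; the loop gives the same result in O(n).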
# Reinforcement learning tip
When solving problems with a continuous action space, the batch size needs to be very large: on the order of 512 or 1024 rather than 32 or 64.
# Log-Derivative Trick
https://talkingaboutme.tistory.com/entry/RL-Spinning-Up-Intro-to-Policy-Optimization
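For quick reference, the identity usually meant by this name can be written as below. The notation (trajectory $\tau$, policy parameters $\theta$, return $R(\tau)$) follows the common Spinning Up convention and is an assumption here, since this post does not define any symbols itself.

```latex
\nabla_\theta P(\tau \mid \theta)
  = P(\tau \mid \theta)\, \nabla_\theta \log P(\tau \mid \theta)

% Applied to the expected return, this moves the gradient inside the expectation:
\nabla_\theta \, \mathbb{E}_{\tau \sim \pi_\theta}\!\left[ R(\tau) \right]
  = \mathbb{E}_{\tau \sim \pi_\theta}\!\left[ R(\tau)\, \nabla_\theta \log P(\tau \mid \theta) \right]
```

The first line is just the chain rule applied to $\log P(\tau \mid \theta)$, rearranged.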