[WIP] Adam learning rule #1425

Open
JesseLivezey wants to merge 3 commits into lisa-lab:master from JesseLivezey:adam

Conversation

@JesseLivezey

Contributor

This implementation is based on v4 of the arXiv paper. I haven't run or tested it yet.

The paper seems to have at least one typo: \beta_2^t is used but never defined. For now I'm assuming it is just \beta_2. I'm also assuming that \beta_{1,t} is the same thing as \beta_1^t.
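
For reference, a minimal sketch of one Adam step in the form most commonly cited from the published paper, where \beta_1^t and \beta_2^t mean \beta_1 and \beta_2 raised to the power t in the bias-correction terms. That reading of the notation is an assumption here: it has not been checked against v4 or against the code in this PR, and it leaves out any decay of \beta_1. The name `adam_step` and the default hyperparameters are illustrative only.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter.

    Assumes t starts at 1 and that beta1 ** t / beta2 ** t are plain powers
    (an assumption about the paper's notation, not taken from this PR).
    """
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    # Bias correction for the zero initialization of m and v.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    # Parameter update.
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# Usage: a few steps on f(x) = x^2, whose gradient is 2x.
theta, m, v = 5.0, 0.0, 0.0
for t in range(1, 11):
    theta, m, v = adam_step(theta, 2.0 * theta, m, v, t, lr=0.05)
```

With zero-initialized moments, the bias correction makes the first step have magnitude close to `lr` regardless of the gradient scale, which is one quick sanity check for any implementation.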

@goodfeli

goodfeli commented Mar 6, 2015

Contributor

Why not just use Alec Radford's implementation?
https://gist.github.com/Newmu/acb738767acb4788bac3

I've been using that plugged into Pylearn2 in my private repo and it works well.

@JesseLivezey

Contributor Author

I don't think Alec's version is consistent with the most recent version of the paper, but I haven't really tested this implementation vs. his, so I'm not sure how different the results will be.

@JesseLivezey

Contributor Author

It might just be that Alec's version doesn't decay beta1. The betas have also been redefined, though, and I haven't checked whether the rest of the math is equivalent.
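
One way to check the "redefined betas" part in isolation is sketched below. It assumes, hypothetically, that one convention weights the new gradient by `b` while the other weights the previous moment by `beta`; neither function is copied from Alec's gist or from this PR. Under that assumption the two moment updates agree whenever `b = 1 - beta`. The sketch says nothing about the beta1 decay or the rest of the update.

```python
def ema_weight_on_gradient(prev, grad, b):
    # Hypothetical convention A: b is the weight on the new gradient.
    return b * grad + (1.0 - b) * prev


def ema_weight_on_history(prev, grad, beta):
    # Hypothetical convention B: beta is the weight on the previous moment.
    return beta * prev + (1.0 - beta) * grad


def max_moment_gap(grads, beta):
    """Largest difference between the two conventions when b = 1 - beta."""
    m_a = m_b = 0.0
    gap = 0.0
    for g in grads:
        m_a = ema_weight_on_gradient(m_a, g, b=1.0 - beta)
        m_b = ema_weight_on_history(m_b, g, beta=beta)
        gap = max(gap, abs(m_a - m_b))
    return gap


# Usage: the gap should be at floating-point rounding level.
gap = max_moment_gap([0.5, -1.0, 2.0, 0.25], beta=0.9)
```

The same comparison can be repeated for the second-moment estimate by feeding in squared gradients and the corresponding `beta2` value.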

2 participants