convoluted connor @notwa@witches.town

@hjkl side note, the momentum-boosts-the-learning-rate thing is my own idea; i'm not sure how well it holds in practice. but if you treat the momentum update as an LTI system, its magnitude plot shows a gain at DC in the proportion i stated.
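
a rough sketch of the LTI view, assuming the classical momentum form v_t = mu * v_{t-1} + g_t (that form, and the names `mu`, `momentum_step_response` and `dc_gain`, are my assumptions for illustration, not something fixed by the post). under that assumption the transfer function is H(z) = 1 / (1 - mu * z^-1), so the gain at DC (z = 1) is 1 / (1 - mu), and a constant gradient should settle to that value:

```python
def momentum_step_response(mu, steps):
    """Feed a constant gradient of 1.0 through v_t = mu * v_{t-1} + g_t.

    Assumes the classical (non-averaging) momentum update; returns the
    velocity after `steps` iterations, starting from v = 0.
    """
    v = 0.0
    for _ in range(steps):
        v = mu * v + 1.0
    return v


def dc_gain(mu):
    """Gain at z = 1 of H(z) = 1 / (1 - mu * z^-1), valid for |mu| < 1."""
    return 1.0 / (1.0 - mu)


if __name__ == "__main__":
    mu = 0.9
    # The settled velocity and the analytic DC gain should agree closely.
    print(momentum_step_response(mu, 500), dc_gain(mu))
```

if the update is instead written as an exponential moving average, v_t = mu * v_{t-1} + (1 - mu) * g_t, the DC gain is 1 and this boost goes away, so the result depends on which form the optimizer uses.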

for fun, i've tried implementing a second-order filter as an optimizer, but i couldn't personally manage anything better than a traditional well-tuned momentum optimizer.