* no trailing whitespace
* GPU MaxPool2D.backward(); TinyConvNet train passes!
* Fix GPU avgpool.forward() init_val
Doesn’t change the result, but is simpler.
* Fix MaxPool GPU init_val
Tests only cover random non-negative inputs; this fixes issues when negative inputs are fed to GPU MaxPool2D. Test update to follow.
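The init_val bug can be illustrated with a toy reduction (a hypothetical helper, not the actual tinygrad kernel): seeding a max-reduce with 0 instead of -inf silently clamps any all-negative window to 0.

```python
import numpy as np

# Hypothetical max-reduce mirroring a pooling kernel's inner loop:
# the accumulator starts at init_val and folds in each window element.
def maxpool_reduce(window, init_val):
    acc = init_val
    for v in window.flat:
        acc = max(acc, v)
    return acc

window = np.array([[-3.0, -1.0],
                   [-2.0, -5.0]])
assert maxpool_reduce(window, 0.0) == 0.0        # wrong: true max is -1
assert maxpool_reduce(window, -np.inf) == -1.0   # correct with a -inf seed
```

Random non-negative test inputs never hit the first branch, which is why the bug survived the existing tests.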
* to make it work locally
* definitely not working
* Conv2D GPU passes some of the tests
* Conv2D GPU passes more of the tests
* passes some tests and mnist
* removed unnecessary code
* Conv2D Backpass works
* wrong test_ops.py
* white space + test backward
* erased useless code
* removed default argument
* long lines
Strided CPU pooling was introduced assuming a small kernel size
(<=(10,10)), but efficientnet.py feeds kernel_size=(112,112).
This causes a huge array buffer allocation in stack_for_pool() that
hangs inference for a long time, or until the system OOMs.
Revert CPU pooling for now, and re-introduce #74 later with a new
global-average-pooling op that can be used instead of avgpool2d with
a large kernel size for efficientnet inference.
Co-authored-by: Ryan Neph <ryanneph@google.com>
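A minimal sketch of why the buffer blows up and why a global-average-pool op avoids it (hypothetical helper names, not the tinygrad API): a stacking-style pool materializes one shifted copy of the input per kernel element, so the temporary buffer grows with the kernel area, whereas a full-extent average pool is just a plain mean and needs no stacking buffer at all.

```python
import numpy as np

# Hypothetical window-stacking average pool: materializes k*k shifted
# copies of the input, so the temporary buffer scales with the kernel
# area (k*k), which is what blew up for kernel_size=(112,112).
def stacked_avgpool2d(x, k):
    oh, ow = x.shape[0] - k + 1, x.shape[1] - k + 1
    windows = np.stack([x[i:i + oh, j:j + ow]
                        for i in range(k) for j in range(k)])
    return windows.mean(axis=0)

# When the kernel covers the whole input, a global mean gives the same
# answer with no stacking buffer.
x = np.random.randn(112, 112).astype(np.float32)
assert np.allclose(stacked_avgpool2d(x, 112), x.mean(), atol=1e-4)
```

This is the equivalence the proposed global-average-pooling op would exploit for efficientnet inference.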
* copy tensors to and from gpu
* add on GPU
* adding works
* we stick shapes in
* works on cpu and gpu
* test changes, not passing yet
* something else
* op tests pass
* add, mean, and sum have working forward/backward
* mul ops test
* no gpu support, no problem
* test pass, clean up later
* gpu cleanup
* cleanup test ops, don't let div fail
* revert more
* simpler dispatcher
* clean up grad
* GPU and
* grad is a Tensor now
* gate test on GPU
* cleanups
* late loading gpu
* GPU as input option
* last cleanups