43 Commits

Author SHA1 Message Date
Yixiang Gao
9d93a82354 remove FAKEDATA (#1685) 2023-08-26 20:15:54 -04:00
Yixiang Gao
173850f599 fix CIFAR jit (#1657)
* update mask function

* kept 94 with the new fetcher

clean up batch fetcher

* 94.04% without cutmix

* 94.04% with cutmix

* move batch fetcher to avoid fetching additional batch last STEP
2023-08-24 16:14:40 -07:00
George Hotz
a6d842af7a move device to ops (#1646)
* move device to ops

* mlops types

* 2 lines
2023-08-23 08:30:17 -07:00
George Hotz
718ced296c move state to nn/state (#1619) 2023-08-22 07:36:24 -07:00
Umut Zengin
f720682beb np.argmax to Tensor.argmax (#1608)
* to tensor argmax

* removed keepdim

* training update
2023-08-21 15:22:29 -07:00
wozeparrot
50decf0d45 train cifar using multigpu (#1529)
* feat: train cifar using multigpu

* feat: split eval batch across 5

* feat: cleaner allreduce

* feat: 93.88%

* feat: cleaner batch chunking from bert

* feat: cleaner grad sync

* feat: tinygrad argmax

* feat: make it work with different gpu counts

* feat: move some stuff into the normal __init__

* feat: autodetect gpu count

* feat: move import inside
2023-08-18 09:35:44 -07:00
Yixiang Gao
6480a1a180 CIFAR 94.03% (#1340)
* add disk_tensor

* fix jit

* new baseline before whitening

* whitening through torch

* whiting done currently at 91.65%

* 91.99%

* clean up mixup and 92.3%

* clean up 92.30%

* 92.49% before searching for new hyper-parameters

* fix CI

* fix white space

* add whitening init in test

* refactor, update hyperpara, 92.72%

* converting whiting to tinygrad operation

* update CI kernels count for CIFAR

* add pad reflect

* add random crop 92.53%

* update hyperpara 93%

* 93.15% on docker container, need to refactor the assignment for hyper param

* print out weights and bias to be separated

* bias/non-bias params separated

* fix whitespace

* clean up

* refactor hyper-param with dict

* refactor lr schedular params

* fix whitespace

* fix cross entropy loss

* fix whitespace

* move opt hyp to hyp dict

* minor fixup

* adjust model, loss scaling

* 92.74% while using half of compute as before

* update hyp for cutmix

* random shuffle during batches

* clean up

* updating the model

* update ConvGroup

* disable gradients for batchnorm layer weights

* whitespace

* 93.92%

* clean up

* finally 94%git add .!

* rewrite whitening to remove dependency on torch

* whitespace

* remove dependency on torch, 93.91%

* back to 94.03%

* clean up

* update test_real_world
2023-08-08 15:13:24 -07:00
George Hotz
f7b0320d8b add cifar training regression test (#1287)
* add cifar training regression test

* clean up print
2023-07-19 14:17:09 -07:00
Yixiang Gao
a8f2c16f8e add contiguous (#1246) 2023-07-15 08:36:34 -07:00
terafo
aa60feda48 Fix naming conflict with huggingface datasets (#1161)
* Rename in files

* Move files

* Moved to extra/datasets as suggested

* Changes to files

* Fixed stupid mistake

---------

Co-authored-by: terafo <terafo@protonmail.com>
2023-07-07 10:43:44 -07:00
Kunwar Raj Singh
8391648822 Over 90% on CIFAR with examples/hlb_cifar10.py (#1073)
* fix eval, lr decay, best eval

* 82.27

* 82.64

* 82.79, reproducable

* add lr sched, 85.26

* 87.42

* 87.94

* 87.42

* tta with flip

* training flip aug

* refactor

* using Tensor for LR is faster

* 89.5

* refactor, flip only train set

* 90.01

* 90.64

* eval jit

* refactor

* only JIT model

* fix eval JIT

* fix eval JIT

* 90.82

* STEPS=900 reaches 90.22

* TTA envvar

* TTA default 0

* fully jit training

* refactor optim

* fix sched

* add label smoothing

* param changes

* patial gelu

* OneCycle with pause

* gelu maybe works

* 90.12

* remove pause lr

* maybe fix lr schedulers

* scheduler test passing

* comments

* try mixup

* shuffle!

* add back the missing last eval

* fix shuffle bugs

* add mixup prob

* fix mixup prob

* 90.19

* correct mixup

* correct mixup

* correct mixup

* 90.24

* 90.33

* refactor, add type hints

* add gradient clipping

* maybe fix test

* full JIT

* back to relu for now

* pass mixup prob as param

* add typehints

* maybe CI works

* try erf gelu

* CI, types

* remove useless import/

* refactor optim

* refactor optim

* try leakyrelu

* try celu

* gelu

* 90.67

* remove grad clip

* remove grad clip tests

* revert params

* add test for OneCycleLR

* 90.62

* fix eval timing

* fix eval timing again

* so where i calculate mixup_prob matters

---------

Co-authored-by: Kunwar Raj Singh <kunwar31@pop-os.localdomain>
2023-07-06 20:46:22 -07:00
Reza Rezvan
8ae9a054ae Refactor nn.optim (#1091)
* Refactor: nn.optim.py

* Refactor: nn.optim.py; Fix all tests

* Refactor: Replace all optim.get_parameters()

* Refactor: Revert list comp.

* Refactor: Replace optim.get_state_dict

* Refactor: Change quickstart.md
2023-07-02 15:07:30 -07:00
George Hotz
8a928ed2f3 nn init matches torch (#901) 2023-06-01 21:24:11 -07:00
George Hotz
1b42b4e1b8 fix examples/hlb_cifar10.py 2023-06-01 19:03:17 -07:00
George Hotz
e4db0c820f hlb_cifar10 init from torch weights 2023-04-18 19:09:13 -07:00
George Hotz
732884653c osx in hlb_cifar10_torch 2023-04-14 13:12:08 -07:00
George Hotz
584ee6f616 don't graph consts 2023-04-14 03:32:20 -07:00
George Hotz
9a39ebefde hlb_cifar10_torch gets 80% 2023-04-14 02:47:03 -07:00
voidz
94bec40110 moved extras/jit.py -> tinygrad/jit.py (#599)
* moved extras/jit.py to tinygrad/jit.py

* fixed indent

* removed tinygrad.helpers.DEBUG from jit.py
2023-02-25 08:32:33 -08:00
George Hotz
2e56a4793e rename log_softmax, support dim, fix onnx Softmax 2023-02-24 10:11:24 -08:00
Jacky Lee
9fd41632c6 Import get_parameters from tinygrad.nn (#559)
* get_parameter is in optim

* Update all imports for get_parameters

* Clean up

* use optim.get_paramters
2023-02-17 15:22:26 -08:00
Jacky Lee
e172f0087a BatchNorm2D -> BatchNorm2d (#558)
* BatchNorm2D -> BatchNorm2d

* Fix typo
2023-02-16 12:31:49 -08:00
George Hotz
9057d98d36 no lr decay in cifar. test this in torch tomorrow 2023-02-11 17:42:54 -08:00
George Hotz
dd7accb9cc decay LR, little bugfix 2023-02-11 17:34:15 -08:00
George Hotz
ba3bf5bdf7 cifar stops learning 2023-02-11 17:21:42 -08:00
George Hotz
9152bb5b4a momentum support in SGD 2023-02-11 10:22:37 -08:00
George Hotz
031edd01e6 switch openpilot compile to TinyJit 2023-02-11 09:51:44 -08:00
George Hotz
608fd730d3 put the JIT in extra 2023-02-11 00:35:18 -06:00
George Hotz
ed8ae7522a tinyjit 2023-02-11 00:22:36 -06:00
George Hotz
4c90a15689 make the fake data actually learnable 2023-02-10 23:35:21 -06:00
George Hotz
07629d7476 fakedata and move to new cache 2023-02-10 23:32:31 -06:00
George Hotz
63fa7daf30 wrong place for CL 2023-02-10 23:22:24 -06:00
George Hotz
fed95119dc CL.mem_used -> GlobalCounters.mem_used 2023-02-10 23:13:29 -06:00
Jacky Lee
f08187526f Fix examples (#540)
* Fix examples

* Remove training in parameters

* Simplify a bit

* Remove extra import

* Fix linter errors

* factor out Device

* NumPy-like semantics for Tensor.__getitem__ (#506)

* Rewrote Tensor.__getitem__ to fix negative indices and add support for np.newaxis/None

* Fixed pad2d

* mypy doesn't know about mlops methods

* normal python behavior for out-of-bounds slicing

* type: ignore

* inlined idxfix

* added comment for __getitem__

* Better comments, better tests, and fixed bug in np.newaxis

* update cpu and torch to hold buffers (#542)

* update cpu and torch to hold buffers

* save lines, and probably faster

* Mypy fun (#541)

* mypy fun

* things are just faster

* running fast

* mypy is fast

* compile.sh

* no gpu hack

* refactor ops_cpu and ops_torch to not subclass

* make weak buffer work

* tensor works

* fix test failing

* cpu/torch cleanups

* no or operator on dict in python 3.8

* that was junk

* fix warnings

* comment and touchup

* dyn add of math ops

* refactor ops_cpu and ops_torch to not share code

* nn/optim.py compiles now

* Reorder imports

* call mkdir only if directory doesn't exist

---------

Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: Mitchell Goff <mitchellgoffpc@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-02-10 12:09:37 -06:00
George Hotz
3d63934995 refactor to keep cl in the runtime (#545)
* refactor to keep cl in the runtime

* fix thneed, rename cl to _cl

* bugfix + _cuda

* fix tests

* thneed more correct
2023-02-08 16:46:09 -06:00
George Hotz
f7291f6ca3 fixes big KOPT, breaks opencl (#505)
* fixes big KOPT, breaks opencl

* fix optimizer

* KernelCache

* oops, broke batchnorm

* hack to fix it

* fix llvm, less hacky gpu

* disable the cache

* cache just breaks things
2023-02-05 10:46:17 -08:00
James Roberts
db0a9b0a2d Refactor CL.time_sum into GlobalCounters (#519) 2023-02-01 20:13:56 -08:00
George Hotz
21f2af08d5 getenv + graphing 2023-01-30 19:15:03 -08:00
George Hotz
60ccddb58b reenable SWAP 2023-01-30 17:32:02 -08:00
George Hotz
aea55eb196 found failing upcast 2023-01-30 16:12:56 -08:00
George Hotz
7ee0d99c70 CLCACHE 2023-01-30 14:02:06 -08:00
George Hotz
cccfea4b25 factor out KOPT code 2023-01-30 13:13:55 -08:00
George Hotz
de2c419fd4 make_pair and first attempt at hlb_cifar10 2023-01-30 11:07:23 -08:00