Commit Graph

2244 Commits

Yixiang Gao
7c2ea85bb0 Raise memory limit for CIFAR test (#1499) 2023-08-08 19:40:56 -04:00
Thiago Franco de Moraes
293a10204b Add tinygrad.renderer to packages in setup.py (#1497) 2023-08-08 15:51:49 -07:00
chenyu
0415a48cfc patch JIT llama chat mode (#1496) 2023-08-08 15:15:56 -07:00
Yixiang Gao
6480a1a180 CIFAR 94.03% (#1340)
* add disk_tensor

* fix jit

* new baseline before whitening

* whitening through torch

* whitening done, currently at 91.65%

* 91.99%

* clean up mixup and 92.3%

* clean up 92.30%

* 92.49% before searching for new hyper-parameters

* fix CI

* fix white space

* add whitening init in test

* refactor, update hyperparams, 92.72%

* converting whitening to tinygrad operation

* update CI kernels count for CIFAR

* add pad reflect

* add random crop 92.53%

* update hyperparams, 93%

* 93.15% on docker container, need to refactor the assignment for hyperparams

* print out weights and biases to be separated

* bias/non-bias params separated

* fix whitespace

* clean up

* refactor hyper-param with dict

* refactor lr scheduler params

* fix whitespace

* fix cross entropy loss

* fix whitespace

* move opt hyp to hyp dict

* minor fixup

* adjust model, loss scaling

* 92.74% while using half of compute as before

* update hyp for cutmix

* random shuffle during batches

* clean up

* updating the model

* update ConvGroup

* disable gradients for batchnorm layer weights

* whitespace

* 93.92%

* clean up

* finally 94%!

* rewrite whitening to remove dependency on torch

* whitespace

* remove dependency on torch, 93.91%

* back to 94.03%

* clean up

* update test_real_world
2023-08-08 15:13:24 -07:00
Roelof van Dijk
aa83a9e910 ci: fix gpuocelot build cache (#1474)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-08 14:00:04 -07:00
George Hotz
d24f936501 just cmplt (#1493)
* just cmplt

* fix maximum

* don't save, there's no backward

* ugh, no slot either

* eq is a scam
2023-08-08 13:58:10 -07:00
Roelof van Dijk
e2cf0f322e [READY] ci: missing n=auto (#1486)
* ci: missing n=auto

* fix: add to commented test

---------

Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-08 07:37:24 -07:00
Roelof van Dijk
0ce7511110 fix: is not used with a literal (#1487)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-08 07:35:30 -07:00
nimlgen
932dad1a2b fix cast bool->float in llvmir (#1480)
Closes #1479
2023-08-07 21:30:51 -07:00
nimlgen
046fd7437a use fake buffer for external_test_speed_llama.py (#1478) 2023-08-07 22:05:44 -04:00
George Hotz
5fdd248617 don't download cifar (#1472) 2023-08-06 21:38:59 -07:00
George Hotz
d78fb8f4ed add stable diffusion and llama (#1471)
* add stable diffusion and llama

* pretty in CI

* was CI not true

* that

* CI=true, wtf

* pythonpath

* debug=1

* oops, wrong place

* uops test broken for wgpu

* wgpu tests flaky
2023-08-06 21:31:51 -07:00
terafo
24933ab551 Actually flip local_max in CUDA (#1462)
* Actually do the flip

* Fixed typo

---------

Co-authored-by: terafo <terafo@protonmail.com>
2023-08-06 10:35:25 -07:00
Diogo
d7d1011f1e Add WEBGPU tests to CI (#1463)
* webgpu tests

* assert device is webgpu

* missed env set

* exclude failing ci tests

* ignore test file

* changed acc for adam test
2023-08-06 10:32:01 -07:00
George Hotz
486a9dbfd9 speed v torch (#1464)
* speed v torch

* always print

* change print

* torch speed tee

* all exposed
2023-08-06 09:32:33 -07:00
George Hotz
2ab282bfec run on update_benchmark too (#1460)
* run on update_benchmark too

* amd inference test

* name it better

* add 10 CIFAR training steps
2023-08-06 08:58:37 -07:00
terafo
3d41674b42 Fixed regression (#1447)
Co-authored-by: terafo <terafo@protonmail.com>
2023-08-06 07:55:58 -07:00
George Hotz
d67e248d9b simple bitcast 2 (#1445)
* simple bitcast 2

* bc 2

* empty

* Revert "empty"

This reverts commit d8ee083655.
2023-08-06 00:30:50 -07:00
George Hotz
943b227cb1 only on push to master 2023-08-06 00:10:07 -07:00
George Hotz
2274e3e757 Fix benchmark (#1454)
* do benchmarking

* system

* artifact

* go

* name artifact

* only on push
2023-08-05 23:44:36 -07:00
George Hotz
bf21aec81f do benchmarking (#1451)
* do benchmarking

* system

* artifact

* go

* name artifact
2023-08-05 23:35:01 -07:00
nimlgen
1ba8ae62a1 Match Torch speed for sum reduction (#1387)
Co-authored-by: Alexander Edwards <alex@alexedw.com>
2023-08-05 22:27:33 -07:00
chenyu
09ede08b23 simplify Node.sum aggregating (#1449) 2023-08-05 22:19:36 -07:00
George Hotz
7fa730b506 external model benchmark test 2023-08-05 22:10:48 -07:00
chenyu
cb5dcc7b57 remove view_from_shape (#1448) 2023-08-05 20:39:13 -07:00
Diogo
e2af95c2f8 moved global_max and local_max to LinearizerOptions, also added assert for max bufs (#1446) 2023-08-05 18:23:18 -07:00
George Hotz
7b8d06c9f1 test uops (#1444)
* test uops

* tests should pass

* improve uops

* precision
2023-08-05 12:35:56 -07:00
George Hotz
84c430355e fix backends for new style (#1443)
* fix backends for new style

* fix method cache

* fix fakeless

* llvm blacklist

* fix kernel optimizer
2023-08-05 11:07:04 -07:00
George Hotz
67781fcf5d fix fail fast in CI 2023-08-05 10:24:24 -07:00
George Hotz
bd7f4b1249 move renamer to linearizer (#1442)
* move renamer to linearizer

* uops converter

* Delete test_uops.py
2023-08-05 08:53:25 -07:00
nimlgen
669b406ec6 correct children count with lazycache (#1429) 2023-08-05 00:30:16 -07:00
Felix
97a6029cf7 Corrected a few misspelled words (#1435) 2023-08-04 16:51:08 -07:00
Adrian Kretz
043d5f2cb5 Fix NOUNROLL (#1439) 2023-08-04 16:50:19 -07:00
Francesco Castelli
579f4615a0 Add assert for wrong matmul/dot shapes (#1438) 2023-08-04 18:16:56 -04:00
Umut Zengin
52db7d7435 inf, -inf support for pad (#1436) 2023-08-04 15:05:25 -04:00
Alex Telon
7325bc914f fix: Context (#1430)
* Fixed issue in Context

* Cleaned up fix

Now that DEBUG.value = 3 always works, we can do so in __new__ as well.
2023-08-04 10:53:48 -04:00
ian
c08ed1949f Fix plt output comment (#1428) 2023-08-03 23:35:52 -07:00
wozeparrot
801bed4f66 Add ops_shm (#1413)
* feat: add ops_shm

* clean: extra newline

* feat: add test

* feat: ci doesn't like that

* feat: ci still doesn't like that

* feat: skip big test on ci

* feat: testing

* feat: big

* feat: testing again

* feat: reskip test
2023-08-03 17:40:52 -07:00
chenyu
34f348643b Support constant expand to symbolic shape (#1411) 2023-08-02 21:21:22 -07:00
chenyu
6572ca6835 support symbolic expand (#1407) 2023-08-02 20:03:46 -04:00
wozeparrot
a367f71fea fix: don't put kernels into cache when optimizing (#1409) 2023-08-02 18:17:16 -04:00
Paolo Gavazzi
9ffa1eb7e2 Removed dep of torch, torchaudio, kept librosa only (#1264) 2023-08-02 13:52:04 -04:00
George Hotz
fc2303e520 gitignore in weights 2023-08-02 16:26:41 +00:00
chenyu
18d0a93f09 LazyBuffer.get_variable_buffers() (#1391)
* LazyBuffer.get_variable_buffers()

* remove left_only, add ProdNode

* no vars for OpNode.b

* do not change symbolic vars, remove ProdNode
2023-08-02 09:01:35 -07:00
Umut Zengin
8889821547 Const pad support to pad2d and slice (#1392)
* slice to pad2d migrate

* Gain line

* Mypy happy

* Mypy happy

* Revert

* whitespace
2023-08-02 08:58:52 -07:00
wozeparrot
ab9e4a2e93 Make cuda CI a bit more consistent (#1403)
* feat: use fast-apt-mirror

* feat: use in more places
2023-08-02 07:38:22 -07:00
wozeparrot
7aff8c4ded cl fixes (#1402)
* feat: non-blocking

* feat: store event on buffer
2023-08-01 22:13:51 -07:00
Alex Telon
b66361843a Timing and Context can now be used as decorators (#1385)
* Context and Timing can now be used as decorators

* Using Timing decorator in quickstart.md

The time formatting is better and it is a useful tool to learn.

Old: Time: 3.5260659999912605
New: Time: 3526.14 ms

* Updated env_vars documentation for Context

* Added test for Context decorator

* Put new import on same line as others
2023-08-01 17:16:10 -07:00
chenyu
d9d1372dd0 Update pytest.ini format (#1398) 2023-08-01 18:00:51 -04:00
George Hotz
f4218b709f Revert "Improve Metal runtime command buffer handling (#1335)" (#1397)
This reverts commit bd54105b6b.
2023-08-01 12:10:20 -07:00