2229 Commits

Author SHA1 Message Date
George Hotz
2ab282bfec run on update_benchmark too (#1460)
* run on update_benchmark too

* amd inference test

* name it better

* add 10 CIFAR training steps
2023-08-06 08:58:37 -07:00
terafo
3d41674b42 Fixed regression (#1447)
Co-authored-by: terafo <terafo@protonmail.com>
2023-08-06 07:55:58 -07:00
George Hotz
d67e248d9b simple bitcast 2 (#1445)
* simple bitcast 2

* bc 2

* empty

* Revert "empty"

This reverts commit d8ee083655.
2023-08-06 00:30:50 -07:00
George Hotz
943b227cb1 only on push to master 2023-08-06 00:10:07 -07:00
George Hotz
2274e3e757 Fix benchmark (#1454)
* do benchmarking

* system

* artifact

* go

* name artifact

* only on push
2023-08-05 23:44:36 -07:00
George Hotz
bf21aec81f do benchmarking (#1451)
* do benchmarking

* system

* artifact

* go

* name artifact
2023-08-05 23:35:01 -07:00
nimlgen
1ba8ae62a1 Match Torch speed for sum reduction (#1387)
Co-authored-by: Alexander Edwards <alex@alexedw.com>
2023-08-05 22:27:33 -07:00
chenyu
09ede08b23 simplify Node.sum aggregating (#1449) 2023-08-05 22:19:36 -07:00
George Hotz
7fa730b506 external model benchmark test 2023-08-05 22:10:48 -07:00
chenyu
cb5dcc7b57 remove view_from_shape (#1448) 2023-08-05 20:39:13 -07:00
Diogo
e2af95c2f8 moved global_max and local_max to LinearizerOptions also added assert for max bufs (#1446) 2023-08-05 18:23:18 -07:00
George Hotz
7b8d06c9f1 test uops (#1444)
* test uops

* tests should pass

* improve uops

* precision
2023-08-05 12:35:56 -07:00
George Hotz
84c430355e fix backends for new style (#1443)
* fix backends for new style

* fix method cache

* fix fakeless

* llvm blacklist

* fix kernel optimizer
2023-08-05 11:07:04 -07:00
George Hotz
67781fcf5d fix fail fast in CI 2023-08-05 10:24:24 -07:00
George Hotz
bd7f4b1249 move renamer to linearizer (#1442)
* move renamer to linearizer

* uops converter

* Delete test_uops.py
2023-08-05 08:53:25 -07:00
nimlgen
669b406ec6 correct children count with lazycache (#1429) 2023-08-05 00:30:16 -07:00
Felix
97a6029cf7 Corrected a few misspelled words (#1435) 2023-08-04 16:51:08 -07:00
Adrian Kretz
043d5f2cb5 Fix NOUNROLL (#1439) 2023-08-04 16:50:19 -07:00
Francesco Castelli
579f4615a0 Add assert for wrong matmul/dot shapes (#1438) 2023-08-04 18:16:56 -04:00
Umut Zengin
52db7d7435 inf, -inf support for pad (#1436) 2023-08-04 15:05:25 -04:00
Alex Telon
7325bc914f fix: Context (#1430)
* Fixed issue in Context

* Cleaned up fix

Now that DEBUG.value = 3 always works we can do so in __new__ as well.
2023-08-04 10:53:48 -04:00
ian
c08ed1949f Fix plt output comment (#1428) 2023-08-03 23:35:52 -07:00
wozeparrot
801bed4f66 Add ops_shm (#1413)
* feat: add ops_shm

* clean: extra newline

* feat: add test

* feat: ci doesn't like that

* feat: ci still doesn't like that

* feat: skip big test on ci

* feat: testing

* feat: big

* feat: testing again

* feat: reskip test
2023-08-03 17:40:52 -07:00
chenyu
34f348643b Support constant expand to symbolic shape (#1411) 2023-08-02 21:21:22 -07:00
chenyu
6572ca6835 support symbolic expand (#1407) 2023-08-02 20:03:46 -04:00
wozeparrot
a367f71fea fix: don't put kernels into cache when optimizing (#1409) 2023-08-02 18:17:16 -04:00
Paolo Gavazzi
9ffa1eb7e2 Removed dep of torch, torchaudio, kept librosa only (#1264) 2023-08-02 13:52:04 -04:00
George Hotz
fc2303e520 gitignore in weights 2023-08-02 16:26:41 +00:00
chenyu
18d0a93f09 LazyBuffer.get_variable_buffers() (#1391)
* LazyBuffer.get_variable_buffers()

* remove left_only, add ProdNode

* no vars for OpNode.b

* do not change symbolic vars, remove ProdNode
2023-08-02 09:01:35 -07:00
Umut Zengin
8889821547 Const pad support to pad2d and slice (#1392)
* slice to pad2d migrate

* Gain line

* Mypy happy

* Mypy happy

* Revert

* whitespace
2023-08-02 08:58:52 -07:00
wozeparrot
ab9e4a2e93 Make cuda CI a bit more consistent (#1403)
* feat: use fast-apt-mirror

* feat: use in more places
2023-08-02 07:38:22 -07:00
wozeparrot
7aff8c4ded cl fixes (#1402)
* feat: non-blocking

* feat: store event on buffer
2023-08-01 22:13:51 -07:00
Alex Telon
b66361843a Timing and Context can now be used as decorators (#1385)
* Context and Timing can now be used as decorators

* Using Timing decorator in quickstart.md

The time formatting is better and is a useful tool to learn.

Old: Time: 3.5260659999912605
New: Time: 3526.14 ms

* Updated env_vars documentation for Context

* Added test for Context decorator

* Put new import on same line as others
2023-08-01 17:16:10 -07:00
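
The commit above describes a timing helper that works both as a context manager and as a decorator, printing milliseconds instead of raw seconds. A minimal sketch of that general pattern follows, assuming Python's standard `contextlib.ContextDecorator`. The class name `TimingSketch` and its `prefix` parameter are hypothetical and are not taken from the repository; the actual implementation may differ.

```python
import contextlib
import time


class TimingSketch(contextlib.ContextDecorator):
    """Hypothetical stand-in for a timing helper; not the repository's code."""

    def __init__(self, prefix="Time: "):
        self.prefix = prefix
        self.elapsed_ms = None

    def __enter__(self):
        # Record the start time when the block (or decorated call) begins.
        self.start = time.perf_counter()
        return self

    def __exit__(self, *exc):
        # Convert seconds to milliseconds and print with two decimals,
        # in the style of the "New: Time: 3526.14 ms" line in the commit.
        self.elapsed_ms = (time.perf_counter() - self.start) * 1e3
        print(f"{self.prefix}{self.elapsed_ms:.2f} ms")
        return False  # do not swallow exceptions


# Used as a decorator: every call to work() is timed.
@TimingSketch("Time: ")
def work():
    return sum(range(100_000))


result = work()

# Used as a context manager: only the enclosed block is timed.
with TimingSketch("Block: ") as t:
    sum(range(1000))
```

Inheriting from `ContextDecorator` lets one class serve both uses without writing a separate wrapper function, which is one plausible way to get the behavior the commit title describes.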
chenyu
d9d1372dd0 Update pytest.ini format (#1398) 2023-08-01 18:00:51 -04:00
George Hotz
f4218b709f Revert "Improve Metal runtime command buffer handling (#1335)" (#1397)
This reverts commit bd54105b6b.
2023-08-01 12:10:20 -07:00
Diogo
4dc8595069 simple exporting models (#1344)
* unified exporting

* json exporting

* ignore more

* simplified buffer export

* added dtypes

* added assert

* swift example

* fix tests

* linter

* remove whitespace

* fixed tests

* remove swift example

* remove unintended changes

* allow callable models to be used

* whitespace

* more readable json export

* name change

* whitespace

* whitespace
2023-08-01 09:35:48 -07:00
wozeparrot
7c7cf16ef2 use host ptr for speed on copyouts (#1393)
* feat: use mapped buffer for speed

* fix: whoops don't need that

* feat: don't need explicit call to memoryview
2023-08-01 09:34:12 -07:00
Diogo
ba5e3818a0 Limit dims based on max size (#1390)
* working

* whitespace

* changed defaults to None

* linter

* last linter error
2023-07-31 19:18:19 -07:00
chenyu
b2fde9ec36 reshape to register variable value (#1386)
* reshape to register variable value

* better error message
2023-07-31 17:10:02 -07:00
Umut Zengin
0de5f20970 Re-open constant pad support to Tensor.pad (#1388)
* Added const padding support to .pad

* Linter
2023-07-31 17:08:57 -07:00
David Hou
3300d0aeaf syncthreads before wmma (#1389)
(venv) chaos@tiny3:~/tinygrad$ KX=2 KY=2 N=2048 python extra/gemm/hip_matmul.py
   4194304    289.60 us, would be  59322.55 GFLOPS matmul, 173.80 GB/s
2023-07-31 17:05:49 -07:00
Alex Telon
2d10e0340e Refactored ContextVars (#1331) 2023-07-31 15:44:46 -04:00
George Hotz
f27df835a6 delete dead stuff (#1382)
* delete bpe from repo

* remove yolo examples

* Revert "remove yolo examples"

This reverts commit cd1f49d466.

* no windows
2023-07-31 11:17:49 -07:00
Yixiang Gao
6e62dcfbf3 add check global dim limit in linearizer (#1299)
* need a better place for reshape and permute

* add permutation

* cuda fixed

* clean up

* enable nvidia GPU with global max

* fix order

* fix CI

* add check for global dim limit but need refactor

* refactor

* fix ignore
2023-07-31 11:14:54 -07:00
ronak69
ce0ab1c14e convert $@ to "$@" in run_multibackend.sh (#1379) 2023-07-31 10:39:22 -07:00
chenyu
f5ef445cb6 trim space (#1381) 2023-07-31 10:37:57 -07:00
JaSpa99
5ab12059da rng hlops: add normal and kaiming_normal (#1378)
* add normal and kaiming_normal

* make sure its float

* add tests
2023-07-31 10:37:02 -07:00
George Hotz
37fa7e96fb Revert "update editorconfig, enforce via CI (#1343)" (#1380)
This reverts commit da2efecbe2.
2023-07-31 10:35:50 -07:00
Pavol Rusnak
da2efecbe2 update editorconfig, enforce via CI (#1343)
* update editorconfig to set unix-style newlines and trim whitespace

* add editorconfig github action to the CI

* fix whitespace
2023-07-30 18:44:30 -07:00
S-Lykles
c2b82ea8ac fix to_shape_strides (#1374)
* add tests for expr_node and expr_idxs

* simplify condition and add missing optimization
2023-07-30 18:42:46 -07:00