Commit Graph

175 Commits

Author SHA1 Message Date
George Hotz
1ff341bae5 python 3.11 is now required (#13055) 2025-11-02 12:55:40 +08:00
George Hotz
7784cec48e pytest-split on spec (#12959) 2025-10-28 10:09:01 +08:00
chenyu
8baa61bd67 use torch 2.9 and its Muon in test (#12773)
* use torch 2.9 and its Muon in test

* relax and disable
2025-10-21 13:35:17 -04:00
George Hotz
1d1e1d9d88 delete the ShapeTracker (#12720)
* delete the ShapeTracker

* fix tests

* fix more

* fix gc test
2025-10-16 15:36:22 +08:00
George Hotz
0f25b4b289 move frontend dir to nn [pr] (#12470) 2025-10-07 10:42:22 +08:00
chenyu
12a910f1d2 update torch 2.8 (#12172)
support _reshape_alias. something is wrong with one case of unfold
2025-09-14 15:19:03 -04:00
chenyu
98ecab7563 remove ml_dtypes (#12169) 2025-09-14 14:20:05 -04:00
chenyu
0c392089d9 update mypy (#12155) 2025-09-13 09:48:38 -04:00
Sieds Lykles
0757a9a819 add pytest-timeout of 3 min per item (#12144)
* add pytest-timeout with timeout of 3 min

* func_only
2025-09-13 00:48:41 +02:00
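
A sketch of what the commit above adds, assuming pytest-timeout's standard marker API rather than this repo's exact config (the repo-wide 3-minute default and the func_only behavior, which excludes fixture setup from the timer, would live in pytest configuration):

```python
# minimal sketch: pytest-timeout lets a single test carry its own ceiling
import time
import pytest

@pytest.mark.timeout(180)  # fail this test if it runs longer than 180 seconds
def test_finishes_quickly():
    time.sleep(0.01)
```
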
Sieds Lykles
239091d111 numba>=0.55 for uv resolution (#12079)
* force numba version

* update comment
2025-09-09 01:43:32 +02:00
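
A minimal sketch of where such a floor lives, with hypothetical package metadata standing in for tinygrad's actual setup.py:

```python
from setuptools import setup

setup(
    name="example-package",  # placeholder metadata, illustration only
    version="0.0.1",
    extras_require={
        # the lower bound stops resolvers like uv from considering very old
        # numba releases while solving the dependency graph
        "testing": ["numba>=0.55"],
    },
)
```
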
Jordan Chalupka
4785cd959a [TYPED=1] cvar should allow dtype as a tuple (#11770)
* cvar dtype:DType|tuple[DType, ...]|None=None

* fmt

* add a test

* list typeguard as a dep for CI

* extra step to install mypy

* fix venv

* ci fixes

* mv typeguard to testing install group

* simpler TYPED=1 test

* add typeguard to lint group
2025-08-26 12:49:51 -04:00
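
The widened signature above accepts one DType, a tuple of acceptable DTypes, or None; a minimal sketch of how typeguard (added as a dep in that PR) enforces such an annotation at runtime, with a stand-in DType and a hypothetical body:

```python
from typeguard import typechecked

class DType:  # stand-in for tinygrad's DType
    def __init__(self, name: str): self.name = name

@typechecked
def cvar(name: str, dtype: DType | tuple[DType, ...] | None = None) -> str:
    return f"{name}:{dtype}"  # hypothetical body; only the signature mirrors the PR

cvar("x", DType("float32"))                   # single dtype: ok
cvar("y", (DType("half"), DType("float32")))  # tuple of dtypes: ok
try:
    cvar("z", "float32")                      # wrong type
except Exception as e:
    print(type(e).__name__)                   # typeguard raises TypeCheckError
```
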
George Hotz
6540bb32a6 move into codegen late [pr] (#11823) 2025-08-24 10:23:25 -07:00
wozeparrot
bcc7623025 feat: bump version to 0.11.0 (#11736) 2025-08-19 17:08:56 -04:00
chenyu
4ddefbccb4 update setup packages (#11674)
sorted, and added missing 'tinygrad.frontend' and 'tinygrad.runtime.autogen.nv'
2025-08-14 19:24:57 -04:00
George Hotz
82be8abfd2 move opt under codegen (#11569) 2025-08-07 14:19:17 -07:00
George Hotz
067daee5be pin torch to 2.7.1 (#11519) 2025-08-05 15:58:57 -07:00
George Hotz
842184a1ab rename kernelize to schedule, try 2 (#11305) 2025-07-21 11:18:36 -07:00
geohotstan
536b254df4 Bump onnx to 1.18.0 (#11266)
* bump

* thou hast implemented functions

* hacked in domain support

* some clean ups

* hack quantize_onnx_test too

* add helper lol, why onnx tests why

* better dispatcher, but need tests and better naming

* flaky ci

* change some names

* small clean ups

* make it easier to clean up tests once ORT supports 1.18.0

* nits

* fix bug of Softmax_1 being registered in onnx_ops

* need a default value

* resolve_const is better name

* fix OnnxRunner.to

* use proper domain names
2025-07-17 15:35:41 -04:00
George Hotz
397826f0b4 add a test for 1B llm (#11124)
* add a test for 1B llm

* fix mbs

* add apps to release
2025-07-07 18:47:25 -07:00
qazal
1127302c46 move perfetto to extra (#10994)
* move perfetto to extra

* update TestViz and fix tests

* remove perfetto.html from viz directory

* work

* mypy
2025-06-27 01:53:54 +03:00
nimlgen
1c45b9f7fb start nvpci (#10521)
* start nvpci

* talk to fsp

* boot args

* riscv core booted

* q

* agen

* got gsp init msg

* some fixes

* set registry, stuck after lockdown

* start ga/ad port

* gsp init on ada

* more classes allocated

* more

* mm

* fixes and progress

* no huge pages for now

* mm seems working, but switch to 512mb page for simplicity

* working state

* not cleaned

* cleaned

* nvd=1

* start gr ctx

* compute

* clean 1

* cleanup 2

* cleanup 3

* cleaner 4

* cleaner 6

* add iface to nv

* save before reboot

* merged into NV

* moveout mm

* post merge

* cleaner 7

* merge and rebase

* pciiface abstraction + reset

* download fw from web

* print logs

* minor changes + p2p

* cleaner 8

* cleaner 9

* cleaner 10

* delete

* delete this as well

* linter 1

* oops

* priv_client -> priv_root

* fix mypy

* mypy?

* mypy?

* small changes

* shorter

* oops

* remove this

* do not allocate paddr for reserve

* nodiff

* unified script

* oops

* dif ver

* add lock

* setup
2025-06-25 00:37:34 +03:00
George Hotz
b41e0563a3 move stuff to kernelize folder (#10902)
* move stuff to kernelize folder

* oops, forgot that
2025-06-20 16:10:20 -07:00
George Hotz
92678e59ee move kernel to opt (#10899) 2025-06-20 15:22:28 -07:00
George Hotz
7ff175c022 cache a venv to avoid pip usage (#10689)
* try built in pip caching

* try venv

* export venv

* set VIRTUAL_ENV

* revert that

* venv key

* fix

* ci cache hit?

* fix windows
2025-06-07 20:13:41 -07:00
qazal
ed37f29184 remove unused lib directory from viz setup [pr] (#10639) 2025-06-05 13:54:31 +03:00
Fang-Pen Lin
b0913295d2 Add missing js files in python package data for viz (#10624) 2025-06-04 10:49:43 +03:00
George Hotz
411392dfb7 move files into uop dir (#10399)
* move files into uop dir [pr]

* tinygrad.uop is a thing

* fix uop docs, no pr

* fix viz
2025-05-18 11:38:28 -07:00
wozeparrot
1ed04f993b move benchmark stat tracking to influxdb (#10185) 2025-05-15 16:14:56 -07:00
wozeparrot
9b14e8c3cd feat: tag 0.10.3 (#10310) 2025-05-14 15:45:13 -07:00
wozeparrot
2df2ec6640 feat: unpin hypothesis (#10306) 2025-05-14 14:26:28 -07:00
chenyu
61bfd23881 update mlperf-logging version (#9995) 2025-04-22 19:32:39 -04:00
pkotzbach
dbbd755cba FP8s truncate (#9937)
* truncate fp8

* fix

* maybe like that?

* fix linters

* ruff

* move from extra and add ml_types to tests

* minor changes

* str to dtypes and nan support

---------

Co-authored-by: pkotzbach <pawkotz@gmail.com>
2025-04-22 19:12:49 -04:00
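
Truncating to fp8 means rounding a wider float to the nearest value an 8-bit format can hold; a minimal sketch using the ml_dtypes package that appears elsewhere in this log (illustrative values, not the PR's tests):

```python
import numpy as np
import ml_dtypes

x = np.array([0.1, 1.7, 3.14159], dtype=np.float32)
e4m3 = x.astype(ml_dtypes.float8_e4m3fn)  # 4 exponent bits, 3 mantissa bits
e5m2 = x.astype(ml_dtypes.float8_e5m2)    # wider range, coarser precision
# round-trip to float32 to see how much precision each format kept
print(e4m3.astype(np.float32))
print(e5m2.astype(np.float32))
```
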
chenyu
fe6a482f1d pin hypothesis version to 6.131.0 (#9920)
6.131.1 seems to cause timeouts in CI
2025-04-17 16:34:10 -04:00
chenyu
57f4bc3fbb add numpy to setup linting (#9806)
this would have caught the mypy error in the fp8 PR. keep ignore_missing_imports set to true since we also import torch, which is fat
2025-04-09 03:47:03 -04:00
George Hotz
78caf55154 Revert "FP8 support on NVIDIA (#8631)"
This reverts commit 2c8e4ea865.
2025-04-09 12:27:41 +08:00
pkotzbach
2c8e4ea865 FP8 support on NVIDIA (#8631)
* squashed fp8 commits

* tensorcore start

* minor changes

* pre-commit

* pylint

* Delete fp8mul.cu

* clean

* small bugfix

* fix test_dtype

* fix test_dtype_alu

* add EMULATE_CUDA_SM89

* fix ci

* fix test_linearizer

* fix test_linearizer

* fix swizzle

* add debug to simple_matmul

* fixed swizzle

* python emulator

* refactor python emulator

* setup fix

* numpy setup

* ml_dtypes only in emulate_cuda_sm89

* fix pylint

* fix tests

* fix mypy

* fix mypy

* fix ruff

* done python emulator

* add acc type

* tests

* mypy

* clean code

* add cuda tensor core tests to CI

* minor fix

* clean test_dtype.py

* clean cstyle.py

* clean test_ops.py

* fix test

* fix test

* whitespaces

* pylint

* pylint

* amd?

* amd?

* amd

* reduce lines

* mockgpu remove

* fix

* ruff

* ruff

* fix mypy

* ruff

* test only for cuda

* fixed formatting

* small fixes

* small fix

* least_upper_dtype if fp8s not supported

* log and reciprocal are supported for fp8s

* ops python fixes

* dtypes.fp8s use

* e4m3 + e5m2 result dtype test

* truncate linter fix

---------

Co-authored-by: pkotzbach <pawkotz@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-04-08 21:54:04 -04:00
chenyu
8fe83385ec add system json for mi300x mlperf (#9786)
* add system json for mi300x mlperf

```
python3 -m mlperf_logging.system_desc_checker examples/mlperf/training_submission_v5.0/tinycorp/systems/tinybox_8xMI300X.json training 4.1.0
INFO -   System description checker passed for tinybox 8xMI300X
```

also removed the rocm from tinybox_red since we are not using it

* update mlperf-logging version
2025-04-08 06:36:44 -04:00
chenyu
4a807ee952 remove duplicated z3-solver in setup.py (#9787) 2025-04-08 06:12:58 -04:00
Sieds Lykles
07d1aefaf4 fast idiv (#9755)
* fast idiv with tests and fuzzer

* Add todo comment

* Add env variable to toggle fast_idiv

* Move env check

* Add fuzz fast_idiv to ci

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-04-07 08:32:24 -04:00
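
"fast idiv" refers to replacing integer division by a constant with a multiply and a shift; a minimal sketch of the textbook Granlund-Montgomery construction with a tiny fuzzer in the spirit of the PR (not tinygrad's actual implementation):

```python
import random

def magic(d: int, max_x: int) -> tuple[int, int]:
    """Pick (m, k) so that (x * m) >> k == x // d for all 0 <= x <= max_x."""
    k = (max_x * d).bit_length()  # guarantees 2**k > max_x * d, making the identity exact
    m = (2**k + d - 1) // d       # m = ceil(2**k / d)
    return m, k

d, max_x = 7, 2**20
m, k = magic(d, max_x)
for _ in range(100_000):          # tiny fuzzer
    x = random.randrange(max_x + 1)
    assert (x * m) >> k == x // d
print("ok")
```
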
chenyu
c20f112e9f example test use z3 to verify valid simplification (#9684) 2025-04-02 01:05:52 -04:00
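
The pattern in that example test: ask z3 to prove the rewritten expression equals the original under the variables' known bounds, so the simplification is checked for every case at once. A toy sketch of the idea (not the actual test):

```python
from z3 import Int, And, Implies, prove

x, y = Int("x"), Int("y")
# claim: (x*4 + y) // 4 simplifies to x whenever x >= 0 and 0 <= y < 4
prove(Implies(And(x >= 0, y >= 0, y < 4), (x * 4 + y) / 4 == x))  # prints "proved"
```
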
geohotstan
a08b07b4da Bump onnx==1.17.0 (#9618)
* bump

* remove resize tf_crop_and_resize

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-03-30 03:21:51 -04:00
chenyu
ee3d313b34 Revert "update ruff to 0.11.2 (#9531)" (#9535)
This reverts commit d8d65e2747.
2025-03-21 14:52:25 -04:00
Francis Lata
eb95825eea RetinaNet dataloader (#9442)
* retinanet dataloader

* remove batch_size from generate_anchors

* refactor kits19 dataset tests

* add tests for dataloader

* fix testing setup and cleanups

* remove unused import
2025-03-21 13:36:41 -04:00
chenyu
d8d65e2747 update ruff to 0.11.2 (#9531)
0.11.2 fixed the false alert from 0.11.1. also pinned the version in setup for now to prevent ruff upgrades from breaking CI
2025-03-21 10:32:59 -04:00
geohotstan
f0b24d230c add test_onnx_ops.py (#8569)
* boom

* fix webgpu

* use exact variable names in test so that AI can read easier

* add tag for specific test name like test a specific dtype

* fix ruff

* astype everything

* dtype in array creation

* just arange

* is 67% considered fixed?

* move test up

* small cleanups

* share function

* add qgemm as well

* add qgemm too

* make sure qgemm comes out as int

* take out qgemm for now

* fixed test

* add correct qgemm

* addressing feedback here too, early naive fix for now

* simplify bias and c to be minimalistic enough to test correctness

* refactored qlinearops

* maybe these asserts aren't the best..

* fix test

* updated tests to cover new ops

* try to add to CI

* move test_onnx_ops into testextra/

* more attention tests

* qlinear_add atol=1

* attention still not fullllllly correct

* it is what it is

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-02-24 16:15:22 -05:00
qazal
1db4341e9f move viz graph to lib/graph [pr] (#9196)
* move viz graph to lib/graph [pr]

* add package

* share with program
2025-02-21 21:04:07 +01:00
Simon R
2318d7ac51 Add missing tinygrad.runtime.autogen.am to packages (#9194) 2025-02-21 15:38:24 +02:00
George Hotz
d3a21cced2 hotfix: bump version to 0.10.2 2025-02-21 10:43:49 +08:00
chenyu
3e22747799 run unit test on windows ci (#9187)
* factor out testing_minimal in setup.py [pr]

* testing_unit + windows
2025-02-20 14:40:41 -05:00
chenyu
287de4ecc6 use torch in test_gradient (#9186)
used torch.autograd.grad, but not sure if it can be used as a template like jax
2025-02-20 12:26:11 -05:00
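
torch.autograd.grad returns the gradient of an output with respect to chosen inputs without touching .grad, which makes it a convenient reference for gradient tests; a toy sketch (not the test file's contents):

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x * x).sum()                   # y = sum(x_i^2)
(g,) = torch.autograd.grad(y, [x])  # analytic gradient is 2*x
print(g)                            # tensor([2., 4., 6.])
```
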