Commit Graph

171 Commits

Author SHA1 Message Date
George Hotz
0f25b4b289 move frontend dir to nn [pr] (#12470) 2025-10-07 10:42:22 +08:00
chenyu
12a910f1d2 update torch 2.8 (#12172)
support _reshape_alias. something is wrong with one case of unfold
2025-09-14 15:19:03 -04:00
chenyu
98ecab7563 remove ml_dtypes (#12169) 2025-09-14 14:20:05 -04:00
chenyu
0c392089d9 update mypy (#12155) 2025-09-13 09:48:38 -04:00
Sieds Lykles
0757a9a819 add pytest-timeout of 3 min per item (#12144)
* add pytest-timeout with timeout of 3 min

* func_only
2025-09-13 00:48:41 +02:00
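The commit above adds a per-item timeout via the pytest-timeout plugin. As a rough sketch (the actual flag placement and values in the PR may differ), the 3-minute limit with the `func_only` behavior mentioned in the bullets would look like this in a pytest config:

```ini
# pytest.ini — hypothetical sketch, not the exact config from the PR
[pytest]
timeout = 180            ; kill any single test item after 3 minutes
timeout_func_only = true ; apply the timeout to the test function only, not its fixtures
```

Both `timeout` and `timeout_func_only` are real pytest-timeout options; the same limit can also be passed on the command line as `pytest --timeout=180`.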
Sieds Lykles
239091d111 numba>=0.55 for uv resolution (#12079)
* force numba version

* update comment
2025-09-09 01:43:32 +02:00
Jordan Chalupka
4785cd959a [TYPED=1] cvar should allow dtype as a tuple (#11770)
* cvar dtype:DType|tuple[DType, ...]|None=None

* fmt

* add a test

* list typeguard as a dep for CI

* extra step to install mypy

* fix venv

* ci fixes

* mv typeguard to testing install group

* simpler TYPED=1 test

* add typeguard to lint group
2025-08-26 12:49:51 -04:00
George Hotz
6540bb32a6 move into codegen late [pr] (#11823) 2025-08-24 10:23:25 -07:00
wozeparrot
bcc7623025 feat: bump version to 0.11.0 (#11736) 2025-08-19 17:08:56 -04:00
chenyu
4ddefbccb4 update setup packages (#11674)
sorted, and added missing 'tinygrad.frontend' and 'tinygrad.runtime.autogen.nv'
2025-08-14 19:24:57 -04:00
George Hotz
82be8abfd2 move opt under codegen (#11569) 2025-08-07 14:19:17 -07:00
George Hotz
067daee5be pin torch to 2.7.1 (#11519) 2025-08-05 15:58:57 -07:00
George Hotz
842184a1ab rename kernelize to schedule, try 2 (#11305) 2025-07-21 11:18:36 -07:00
geohotstan
536b254df4 Bump onnx to 1.18.0 (#11266)
* bump

* thou hast implement functions

* hacked in domain support

* some clean ups

* hack quantize_onnx_test too

* add helper lol, why onnx tests why

* better dispatcher, but need tests and better naming

* flaky ci

* change some names

* small clean ups

* make it easier to clean up tests once ORT supports 1.18.0

* nits

* fix bug of Softmax_1 being registered in onnx_ops

* need a default value

* resolve_const is better name

* fix OnnxRunner.to

* use proper domain names
2025-07-17 15:35:41 -04:00
George Hotz
397826f0b4 add a test for 1B llm (#11124)
* add a test for 1B llm

* fix mbs

* add apps to release
2025-07-07 18:47:25 -07:00
qazal
1127302c46 move perfetto to extra (#10994)
* move perfetto to extra

* update TestViz and fix tests

* remove perfetto.html from viz directory

* work

* mypy
2025-06-27 01:53:54 +03:00
nimlgen
1c45b9f7fb start nvpci (#10521)
* start nvpci

* talk to fsp

* boot args

* riscv core booted

* q

* agen

* got gsp init msg

* some fixes

* set registry, stuck after lockdown

* start ga/ad port

* gsp init on ada

* more classes allocated

* more

* mm

* fixes and progress

* no huge pages for now

* mm seems working, but switch to 512mb page for simplicity

* working state

* not cleaned

* cleaned

* nvd=1

* start gr ctx

* compute

* clean 1

* cleanup 2

* cleanup 3

* cleaner 4

* cleaner 6

* add iface to nv

* save before reboot

* merged into NV

* moveout mm

* post merge

* cleaner 7

* merge and rebase

* pciiface abstraction + reset

* download fw from web

* print logs

* minor changes + p2p

* cleaner 8

* cleaner 9

* cleaner 10

* delete

* delete this as well

* linter 1

* oops

* priv_client -> priv_root

* fix mypy

* mypy?

* mypy?

* small changes

* shorter

* ops

* remove this

* do not allocate paddr for reserve

* nodiff

* unified script

* ops

* dif ver

* add lock

* setup
2025-06-25 00:37:34 +03:00
George Hotz
b41e0563a3 move stuff to kernelize folder (#10902)
* move stuff to kernelize folder

* oops, forgot that
2025-06-20 16:10:20 -07:00
George Hotz
92678e59ee move kernel to opt (#10899) 2025-06-20 15:22:28 -07:00
George Hotz
7ff175c022 cache a venv to avoid pip usage (#10689)
* try built in pip caching

* try venv

* export venv

* set VIRTUAL_ENV

* revert that

* venv key

* fix

* ci cache hit?

* fix windows
2025-06-07 20:13:41 -07:00
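The venv-caching commit above swaps per-run pip installs for a cached virtualenv. A minimal GitHub Actions sketch of the idea, assuming hypothetical cache keys and paths (the real workflow's names differ), following the "export venv" / "set VIRTUAL_ENV" bullets:

```yaml
# hypothetical sketch of caching a venv in CI, not the actual workflow
- uses: actions/cache@v4
  with:
    path: .venv
    key: venv-${{ runner.os }}-${{ hashFiles('setup.py') }}
- run: |
    # create the venv only on a cache miss, then export it for later steps
    [ -d .venv ] || (python -m venv .venv && .venv/bin/pip install -e '.[testing]')
    echo "VIRTUAL_ENV=$PWD/.venv" >> "$GITHUB_ENV"
    echo "$PWD/.venv/bin" >> "$GITHUB_PATH"
```

Keying the cache on `setup.py` means the venv is rebuilt only when the dependency list changes, which is what makes this cheaper than `pip install` on every run.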
qazal
ed37f29184 remove unused lib directory from viz setup [pr] (#10639) 2025-06-05 13:54:31 +03:00
Fang-Pen Lin
b0913295d2 Add missing js files in python package data for viz (#10624) 2025-06-04 10:49:43 +03:00
George Hotz
411392dfb7 move files into uop dir (#10399)
* move files into uop dir [pr]

* tinygrad.uop is a thing

* fix uop docs, no pr

* fix viz
2025-05-18 11:38:28 -07:00
wozeparrot
1ed04f993b move benchmark stat tracking to influxdb (#10185) 2025-05-15 16:14:56 -07:00
wozeparrot
9b14e8c3cd feat: tag 0.10.3 (#10310) 2025-05-14 15:45:13 -07:00
wozeparrot
2df2ec6640 feat: unpin hypothesis (#10306) 2025-05-14 14:26:28 -07:00
chenyu
61bfd23881 update mlperf-logging version (#9995) 2025-04-22 19:32:39 -04:00
pkotzbach
dbbd755cba FP8s truncate (#9937)
* truncate fp8

* fix

* maybe like that?

* fix linters

* ruff

* move from extra and add ml_types to tests

* minor changes

* str to dtypes and nan support

---------

Co-authored-by: pkotzbach <pawkotz@gmail.com>
2025-04-22 19:12:49 -04:00
chenyu
fe6a482f1d pin hypothesis version to 6.131.0 (#9920)
6.131.1 seems to cause timeouts in CI
2025-04-17 16:34:10 -04:00
chenyu
57f4bc3fbb add numpy to setup linting (#9806)
this would have caught the mypy error in the fp8 pr. keep ignore_missing_imports set to true, as we also import torch, which is large
2025-04-09 03:47:03 -04:00
George Hotz
78caf55154 Revert "FP8 support on NVIDIA (#8631)"
This reverts commit 2c8e4ea865.
2025-04-09 12:27:41 +08:00
pkotzbach
2c8e4ea865 FP8 support on NVIDIA (#8631)
* squashed fp8 commits

* tensorcore start

* minor changes

* pre-commit

* pylint

* Delete fp8mul.cu

* clean

* small bugfix

* fix test_dtype

* fix test_dtype_alu

* add EMULATE_CUDA_SM89

* fix ci

* fix test_linearizer

* fix test_linearizer

* fix swizzle

* add debug to simple_matmul

* fixed swizzle

* python emulator

* refactor python emulator

* setup fix

* numpy setup

* ml_dtypes only in emulate_cuda_sm89

* fix pylint

* fix tests

* fix mypy

* fix mypy

* fix ruff

* done python emulator

* add acc type

* tests

* mypy

* clean code

* add cuda tensor core tests to CI

* minor fix

* clean test_dtype.py

* clean cstyle.py

* clean test_ops.py

* fix test

* fix test

* whitespaces

* pylint

* pylint

* amd?

* amd?

* amd

* reduce lines

* mockgpu remove

* fix

* ruff

* ruff

* fix mypy

* ruff

* test only for cuda

* fixed formatting

* small fixes

* small fix

* least_upper_dtype if fp8s not supported

* log and reciprocal are supported for fp8s

* ops python fixes

* dtypes.fp8s use

* e4m3 + e5m2 result dtype test

* truncate linter fix

---------

Co-authored-by: pkotzbach <pawkotz@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-04-08 21:54:04 -04:00
chenyu
8fe83385ec add system json for mi300x mlperf (#9786)
* add system json for mi300x mlperf

```
python3 -m mlperf_logging.system_desc_checker examples/mlperf/training_submission_v5.0/tinycorp/systems/tinybox_8xMI300X.json training 4.1.0
INFO -   System description checker passed for tinybox 8xMI300X
```

also removed the rocm from tinybox_red since we are not using it

* update mlperf-logging version
2025-04-08 06:36:44 -04:00
chenyu
4a807ee952 remove duplicated z3-solver in setup.py (#9787) 2025-04-08 06:12:58 -04:00
Sieds Lykles
07d1aefaf4 fast idiv (#9755)
* fast idiv with tests and fuzzer

* Add todo comment

* Add env variable to toggle fast_idiv

* Move env check

* Add fuzz fast_idiv to ci

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-04-07 08:32:24 -04:00
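"Fast idiv" above refers to the classic trick of replacing integer division by a constant with a multiply and a shift. A self-contained sketch of the standard Granlund–Montgomery construction — hypothetical helper names, not the actual tinygrad implementation:

```python
# Sketch of the multiply-and-shift "fast idiv" trick: replace x // d
# (d a known positive constant) with (x * m) >> s for precomputed m, s.
# Hypothetical helper names; not the code from the PR.

def magic(d: int, nbits: int = 32) -> tuple[int, int]:
    """Pick (m, s) so that (x * m) >> s == x // d for all 0 <= x < 2**nbits."""
    assert d > 0
    l = (d - 1).bit_length()       # smallest l with 2**l >= d
    s = nbits + l
    m = ((1 << s) + d - 1) // d    # ceil(2**s / d); then m*d <= 2**s + 2**l,
                                   # the Granlund-Montgomery correctness bound
    return m, s

def fast_div(x: int, d: int, nbits: int = 32) -> int:
    m, s = magic(d, nbits)
    return (x * m) >> s            # one multiply + one shift, no division
```

On hardware the multiply is a widening `nbits x nbits -> 2*nbits` multiply; a fuzzer like the one the PR adds would compare `fast_div(x, d)` against `x // d` over random inputs.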
chenyu
c20f112e9f example test use z3 to verify valid simplification (#9684) 2025-04-02 01:05:52 -04:00
geohotstan
a08b07b4da Bump onnx==1.17.0 (#9618)
* bump

* remove resize tf_crop_and_resize

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-03-30 03:21:51 -04:00
chenyu
ee3d313b34 Revert "update ruff to 0.11.2 (#9531)" (#9535)
This reverts commit d8d65e2747.
2025-03-21 14:52:25 -04:00
Francis Lata
eb95825eea RetinaNet dataloader (#9442)
* retinanet dataloader

* remove batch_size from generate_anchors

* refactor kits19 dataset tests

* add tests for dataloader

* fix testing setup and cleanups

* remove unused import
2025-03-21 13:36:41 -04:00
chenyu
d8d65e2747 update ruff to 0.11.2 (#9531)
0.11.2 fixed the false positive from 0.11.1. also pinned the version in setup for now to prevent a ruff upgrade from breaking CI
2025-03-21 10:32:59 -04:00
geohotstan
f0b24d230c add test_onnx_ops.py (#8569)
* boom

* fix webgpu

* use exact variable names in test so that AI can read easier

* add tag for specific test name like test a specific dtype

* fix ruff

* astype everything

* dtype in array creation

* just arange

* is 67% considered fixed?

* move test up

* small cleanups

* share function

* add qgemm as well

* add qgemm too

* make sure qgemm comes out as int

* take out qgemm for now

* fixed test

* add correct qgemm

* addressing feedback here too, early naive fix for now

* simplify bias and c to be minimalistic enough to test correctness

* refactored qlinearops

* maybe these asserts aren't the best..

* fix test

* updated tests to cover new ops

* try to add to CI

* move test_onnx_ops into testextra/

* more attention tests

* qlinear_add atol=1

* attention still not fullllllly correct

* it is what it is

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-02-24 16:15:22 -05:00
qazal
1db4341e9f move viz graph to lib/graph [pr] (#9196)
* move viz graph to lib/graph [pr]

* add package

* share with program
2025-02-21 21:04:07 +01:00
Simon R
2318d7ac51 Add missing tinygrad.runtime.autogen.am to packages (#9194) 2025-02-21 15:38:24 +02:00
George Hotz
d3a21cced2 hotfix: bump version to 0.10.2 2025-02-21 10:43:49 +08:00
chenyu
3e22747799 run unit test on windows ci (#9187)
* factor out testing_minimal in setup.py [pr]

* testing_unit + windows
2025-02-20 14:40:41 -05:00
chenyu
287de4ecc6 use torch in test_gradient (#9186)
used torch.autograd.grad, but not sure if it can be a template like jax
2025-02-20 12:26:11 -05:00
qazal
574a905291 Fix running VIZ=1 after package installation + test (#9183)
* test running viz from pip install

* add pkg

* do 10 connection attempts

* include assets in package_data

* quiet curl

* better print
2025-02-20 15:02:00 +01:00
George Hotz
4de084a835 cleanup ci, split docs/autogen, testing_minimal, LLVM Speed [pr] (#8952)
* cleanup ci [pr]

* testing_minimal

* add hypothesis to minimal

* fail tiktoken import okay

* add LLVM speed test

* llvm speed w/o beam
2025-02-07 19:01:59 +08:00
Ahmed Harmouche
133cacadde Autogen webgpu dawn, removing wgpu-py dependency (f16 support part 1) (#8646)
* Switch to dawn, all tests passing locally

* Use dawn-python

* Skip failing test

* Skip midcast and fix timestamp on metal ci

* Autogen webgpu

* Try fetch dawn lib again

* /usr/lib

* Without lib prefix

* Test autogen diff

* Delete webgpu support, move everything to ops_webgpu

* mypy fix

* Simplify, refactor

* Line savings

* No ResultContainer

* Type annotation for result

* Some more simplifications

* Why was this explicit sync used at all?

* Refactor: delete functions that are only used once

* Create shader module inline

* Clear unit tests cache, maybe that solves it

* That wasn't it

* Try deleting cache to pass failing weight compare

* weights_only=False for pytorch 2.6

* Simplify ctype array creation

* Remove nanosecond precision timestamps

* Simplify error handling

* Refactor, add back type annotations

* Deleted custom submit function, refactor

* read_buffer simplify

* Fix use after free, refactor

* Simplify supported_features

* Runtime docs

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-02-07 15:16:59 +08:00
George Hotz
5844883e59 bump master version 2025-02-05 09:08:28 +08:00