Commit Graph

3460 Commits

George Hotz
23b084e70a add device name to device, all are constructed (#3221) 2024-01-23 20:34:56 -08:00
George Hotz
91a1b2bd7a the runner does the build (#3220) 2024-01-23 18:45:43 -08:00
chenyu
9e5409be6c cifar move GlobalCounters.reset() before shard (#3217)
* cifar move GlobalCounters.reset() before shard

also shard mini batch inplace

* don't eval with DISABLE_BACKWARD
2024-01-23 16:07:43 -05:00
Francis Lam
595d05a250 test: fix test_linearizer to use the correct tc_dims (#3218)
also re-enable the test_tensor_core_opts
2024-01-23 16:07:31 -05:00
chenyu
3c179cc27c cifar only shuffle data at epoch start (#3216)
save 1ms CPU time per batch. also only shuffle training set
2024-01-23 14:41:22 -05:00
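The idea, as a minimal sketch (illustrative names, not the hlb_cifar10 code): permute the index order once per epoch and slice mini batches from it, so the per-batch CPU work disappears.

```python
import numpy as np

# illustrative sketch: shuffle once per epoch, then slice batches in order;
# the test set skips the permutation entirely
def batches(X, Y, bs, shuffle=True):
  order = np.random.permutation(len(X)) if shuffle else np.arange(len(X))
  for i in range(0, len(X) - bs + 1, bs):
    idx = order[i:i + bs]
    yield X[idx], Y[idx]
```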
George Hotz
4a07ea355d buffer options should work (#3211)
* buffer options should work

* minor

* fix dtype
2024-01-22 19:23:55 -08:00
George Hotz
a06f34ae42 remove dead lines from cstyle (#3212)
* remove dead lines from cstyle

* external_local_bufs is dead

* more lines

* minor cleanup
2024-01-22 18:59:19 -08:00
chenyu
8465938d29 minor hlb_cifar cleanups (#3208)
mostly cosmetic. LATEBEAM=4 single 7900xtx 59.2 seconds
2024-01-22 12:38:39 -05:00
David Hou
3378625773 name upcast variables (#3200)
* name upcast variables

* typing

* unused
2024-01-22 11:37:28 -05:00
chenyu
827b7a3c64 cleanup pad_reflect and make_square_mask in hlb_cifar (#3206)
removed some complicated-looking stuff. no wall time difference
2024-01-22 11:30:46 -05:00
chenyu
99884f4c98 cifar flags for RANDOM_CROP, RANDOM_FLIP, and CUTMIX (#3204)
experimenting with different setups, also would like to jit the data augmentation next
2024-01-22 01:12:51 -05:00
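A hedged sketch of the flag pattern (not the hlb_cifar code; `augment` and its body are illustrative):

```python
from tinygrad.tensor import Tensor
from tinygrad.helpers import getenv

# gate each augmentation on an env var so setups can be toggled per run
def augment(X: Tensor) -> Tensor:
  if getenv("RANDOM_FLIP", 1):
    mask = Tensor.rand(X.shape[0], 1, 1, 1) < 0.5
    X = mask.where(X.flip(-1), X)  # per-sample horizontal flip
  # RANDOM_CROP and CUTMIX would be gated the same way
  return X
```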
chenyu
53afec2841 add HALF to handcode_resnet50_opt.py (#3202)
use this to study tensor cores on HIP
2024-01-21 23:03:59 -05:00
chenyu
836883fedc comment out cutmix in hlb_cifar (#3201)
it's a no-op with multi gpu and fewer STEPS. also the patch was selected from the whole dataset, not from the same batch
2024-01-21 22:24:53 -05:00
chenyu
e6c71f1b26 fix device of Tensor.arange inside Tensor.one_hot (#3199)
it should have the same device as self
2024-01-21 21:03:50 -05:00
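The shape of the fix, sketched (not the exact diff): any helper tensor built inside a method should follow self.device.

```python
from tinygrad.tensor import Tensor

# sketch: before, Tensor.arange(num_classes) landed on the default device;
# passing device=self.device keeps the whole comparison on one device
def one_hot(self: Tensor, num_classes: int) -> Tensor:
  return (self[..., None] == Tensor.arange(num_classes, device=self.device)).where(1, 0)
```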
chenyu
f7d1c42239 cleanup noop prefixes in _pool (#3198)
* cleanup noop prefixes in _pool

make expand treat dim=None as a noop (in addition to -1). then slice, reshape, expand in _pool can share the same noop prefix

* nit

* something then reshape style

* that's repeat
2024-01-21 20:03:32 -05:00
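The noop-prefix idea in isolation (a standalone sketch, not the _pool code):

```python
# None in a target shape (like -1) means "keep this dim", so an all-None
# call is a pure no-op and chained slice/reshape/expand can share a prefix
def resolve_shape(shape, new_shape):
  return tuple(s if ns in (None, -1) else ns for s, ns in zip(shape, new_shape))

assert resolve_shape((2, 3), (None, None)) == (2, 3)  # no-op
assert resolve_shape((2, 1), (None, 4)) == (2, 4)     # expand dim 1
```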
uuuvn
640e5c36ad Fix metal tests broken by 3f56d1a (#3196)
* Remove from binary_operations before copying binary_operations into integer_binary_operations

* Also remove lt and eq if running on METAL
2024-01-21 11:53:25 -05:00
chenyu
b9d27636aa cleanup test_ops.py (#3192)
- removed exact duplicated tests
- only kept one function if torch_fxn is the same as tinygrad_fxn
- used tensor method instead of class method style
- replaced unneeded `lambda x: f(x)` with just `f`
- re-enabled commented tests that work now
- removed some forward_only now that 0 shape tensors can backward
2024-01-20 20:08:56 -05:00
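The lambda cleanup is the standard pass-the-callable refactor; illustratively:

```python
from tinygrad.tensor import Tensor

fxn = lambda x: Tensor.relu(x)  # the wrapper adds nothing
fxn = Tensor.relu               # the method itself can be passed instead
```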
chenyu
3f56d1a5e8 add operator.lt and operator.eq to test_dtype_alu (#3191)
* add operator.lt and operator.eq to test_dtype_alu

those should pass now that we broadcast before passing to lt and eq.
also updated the test skipping criteria to reuse test_dtype.is_dtype_supported

* llvm lt nan is incorrect

* enable truediv too

* Revert "enable truediv too"

This reverts commit df703235fb.

* just that
2024-01-20 14:54:02 -05:00
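A hedged sketch of the test pattern (not the real test_dtype_alu code): the tensor-level comparison must agree with the python operator.

```python
import operator
from tinygrad.tensor import Tensor

for op in (operator.lt, operator.eq):
  a, b = 2.0, 3.0
  # op(Tensor, Tensor) dispatches to the tensor's __lt__/__eq__
  assert op(Tensor([a]), Tensor([b])).numpy()[0] == op(a, b)
```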
chenyu
c4b5661146 fuzz length for multitensor reduce test case (#3190)
so that the uneven case is not just 0 length and can have other positive values
2024-01-20 00:44:38 -05:00
chenyu
fdb1c2b1d9 move reduce over 0 len axis logic to lazy.py (#3188)
* move reduce over 0 len axis logic to lazy.py

this fixed the uneven shard reduce case when the uneven shard has length 0

* fix interpreted backends

* fix backwards for 0 shape tensors too
2024-01-20 00:13:03 -05:00
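The behaviour being centralized, in one line: reducing over a 0-length axis yields the op's identity.

```python
from tinygrad.tensor import Tensor

# with these commits, the 0-len axis disappears and sum fills its identity
t = Tensor.empty(4, 0)
print(t.sum(axis=1).numpy())  # [0. 0. 0. 0.]
```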
chenyu
485332935e ring copy example (#3185)
* ring copy example

* use ones for init
2024-01-19 23:34:30 -05:00
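A rough sketch of what a ring copy looks like with the Tensor API (device names illustrative; needs multiple devices):

```python
from tinygrad.tensor import Tensor

# the buffer hops device-to-device around a ring instead of fanning out
# from one source, so each link only carries one copy
devices = [f"GPU:{i}" for i in range(4)]
t = Tensor.ones(256, device=devices[0]).realize()
for d in devices[1:] + devices[:1]:
  t = t.to(d).realize()  # one hop per step
```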
George Hotz
254a7372fe buffer copy refactor (#3187) 2024-01-19 20:21:24 -08:00
chenyu
fb4bd2a57d reenable padto to search action (#3183) 2024-01-19 14:17:53 -05:00
chenyu
cb4cfc078a parameterize multitensor tests for reduce (#3181)
reduce on uneven shards is incorrect right now
2024-01-19 14:03:01 -05:00
nimlgen
5097d5b808 fix padto when with late reduce (#3180)
* fix padto test

* no long comment
2024-01-19 14:01:44 -05:00
George Hotz
729a01bf3e complex PRs will not be merged 2024-01-19 10:58:47 -08:00
nimlgen
f87ecbb0f3 fuzzer validates outputs + (partially) oob accesses (#3178)
* fuzzer validates outputs + (partially) oob accesses

* +random

* oob check only for compiled

* type cmp fixes

* fix zeroing

* no prints

* add seed
2024-01-19 13:34:51 -05:00
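The oob part of the idea, as an illustrative numpy sketch (not the fuzzer's code): guard bytes around the real buffer must stay untouched after a kernel runs.

```python
import numpy as np

def run_compiled_kernel(buf: np.ndarray):  # stand-in for a real launch
  buf[:] = 1

GUARD = 64
backing = np.zeros(GUARD + 1024 + GUARD, dtype=np.uint8)
run_compiled_kernel(backing[GUARD:-GUARD])
assert not backing[:GUARD].any() and not backing[-GUARD:].any(), "oob write"
```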
chenyu
b2571d586c hypothesis.st -> hypothesis.strat (#3179)
leave `st` for shapetracker
2024-01-19 11:55:26 -05:00
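After the rename:

```python
from hypothesis import strategies as strat  # was: as st

dims = strat.integers(min_value=1, max_value=8)
```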
chenyu
c4faedebf3 add test cases for negative entry max allreduce (#3177) 2024-01-18 22:26:51 -05:00
chenyu
ab1b7c4d09 fix allreduce for max (#3175)
* test cases to show allreduce for max is incorrect

* oh fixed
2024-01-18 20:25:35 -05:00
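Why negative entries matter for max (an illustrative numpy sketch of the failure mode the #3177 tests guard against, not the tinygrad allreduce path): a zero used as the accumulator identity beats every all-negative value.

```python
import numpy as np

shards = [np.array([-3., -1.]), np.array([-5., -2.])]
acc = np.zeros(2)                          # wrong identity for max
for s in shards: acc = np.maximum(acc, s)
print(acc)                                 # [0. 0.] -- wrong
acc = np.full(2, -np.inf)                  # correct identity
for s in shards: acc = np.maximum(acc, s)
print(acc)                                 # [-3. -1.]
```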
George Hotz
c51c90bcd4 more sync in transfer (#3174) 2024-01-18 17:17:03 -08:00
chenyu
28dcbf0e00 test case sharded batchnorm has different ast on devices (#3172) 2024-01-18 18:12:15 -05:00
chenyu
a60d50487d disable padto, seems to have bug in gpt2 (#3173) 2024-01-18 18:09:30 -05:00
George Hotz
c80884884e event driven hip (#3160)
* event driven hip

* simpler, src makes copy

* pass mypy
2024-01-18 14:35:18 -08:00
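The event mechanism itself, sketched against the real HIP runtime API via ctypes (error checking omitted; needs an AMD GPU to actually run):

```python
import ctypes
hip = ctypes.CDLL("libamdhip64.so")

# record an event on the producer stream, then make the consumer stream
# wait on it GPU-side -- no host synchronization needed
producer, consumer = ctypes.c_void_p(), ctypes.c_void_p()
hip.hipStreamCreate(ctypes.byref(producer))
hip.hipStreamCreate(ctypes.byref(consumer))
event = ctypes.c_void_p()
hip.hipEventCreate(ctypes.byref(event))
hip.hipEventRecord(event, producer)         # after the copy/kernel enqueue
hip.hipStreamWaitEvent(consumer, event, 0)  # 0 = no flags
```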
George Hotz
d2aab65958 remove unused expr node (#3170)
* remove unused expr node

* still works

* simple expr_idxs

* fixup typing
2024-01-18 14:18:43 -08:00
chenyu
097b1390ec touchup test_indexing (#3169) 2024-01-18 14:32:43 -05:00
George Hotz
a04e4d0442 inline clang renderer (#3168) 2024-01-18 11:17:34 -08:00
geohotstan
efbe4788d1 indexing: Final cleanup (#3156)
* init

* feat: add _to_const_val to getitem

* doc: changed docs

* docs: updated more docs

* merge: improved/fancy

* better error msg, minor cleanups

* feat: added index_put to test_indexing

* clean: test_indexing

* revert: gather changes lol

* refactor: use dict for tracking tensor indexing, also asserts for type

* oooooooooops

* ugh

* will revert this commit xD

* fix: removed asserts

* improvement: made in-line if statement clearer

* improved err message and improved slice_int tests

* fix: recover accidentally deleted line

* finishing touches

* reword some docs and del torch device tests in test_indexing

* del some redundant tests

* revert: gather asserts, do it in separate pr

* fix some data_ptr stuff

* done

* done done
2024-01-18 14:08:03 -05:00
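A couple of the getitem behaviours the cleanup consolidates, for orientation (illustrative):

```python
from tinygrad.tensor import Tensor

t = Tensor.arange(12).reshape(3, 4)
print(t[1, 2:].numpy())           # mixed int + slice indexing
print(t[Tensor([0, 2])].numpy())  # advanced (tensor) indexing
```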
chenyu
e139ae550d smaller limit_dims_to_max (#3167)
same questionable logic, but fewer lines now
2024-01-18 13:02:20 -05:00
nimlgen
992067399e clean up exceptions in __del__ everywhere (#3165) 2024-01-18 08:34:09 -08:00
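The pattern being applied everywhere, roughly: __del__ runs on partially-constructed and already-torn-down objects, so it must not raise.

```python
class RawBufferSketch:  # illustrative, not the tinygrad class
  def _free(self, buf): pass  # placeholder for the real deallocation
  def __del__(self):
    # attributes may be missing if __init__ raised or teardown already ran
    if hasattr(self, "_buf"): self._free(self._buf)
```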
Max-We
0338903429 Update kits19.py (#3166) 2024-01-18 08:33:50 -08:00
George Hotz
67bc2ddfd8 JIT cleanups (#3164)
* move GraphException

* factor out apply_graph_to_jit

* that return was wrong
2024-01-17 23:39:57 -08:00
George Hotz
f0c178b7e9 move get_contraction to helpers (#3162)
* move get_contraction to helpers

* move simplify

* lines

* to_movement_ops is not generic
2024-01-17 19:13:11 -08:00
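For reference, get_contraction groups old-shape axes so each group multiplies out to one new-shape dim; a hedged sketch of the expected behaviour:

```python
from tinygrad.helpers import get_contraction  # its home after this move

# groups axes of the old shape so each group multiplies to a new-shape dim
print(get_contraction((2, 3, 4), (6, 4)))  # expected: [[0, 1], [2]]
```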
chenyu
e52a609240 make WINO a context var, and LATEWINO in hlb_cifar (#3161) 2024-01-17 20:21:26 -05:00
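The ContextVar pattern WINO now follows (a sketch; the real WINO lives in tinygrad's helpers):

```python
from tinygrad.helpers import Context, ContextVar

WINO = ContextVar("WINO", 0)
with Context(WINO=1):
  assert WINO.value == 1  # winograd conv on only inside the block
assert WINO.value == 0
```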
George Hotz
ee83505fcc fix test extra issue (#3159) 2024-01-17 11:58:08 -08:00
George Hotz
9cc2577a08 use hip events (#3157)
* use hip events

* cleanup
2024-01-17 10:39:57 -08:00
chenyu
1b508e0f71 fix fuzz_linearizer toCPU to as_buffer (#3158) 2024-01-17 13:18:46 -05:00
George Hotz
743b36f0ce hotfix: copy size is in bytes 2024-01-17 16:44:15 +00:00
George Hotz
2e6162b281 graph cleanup (#3155)
* simpler graph

* unused functions
2024-01-16 20:57:31 -08:00
George Hotz
a72b1b6d65 sharding for llama (#3151)
* shard llama

* sharding works

* simpler

* simpler

* consume option

* disable that test

* save a line

---------

Co-authored-by: George Hotz <george@tinygrad.org>
2024-01-16 19:28:00 -08:00
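The multitensor primitive the llama sharding builds on (device names illustrative; needs two devices):

```python
from tinygrad.tensor import Tensor

devices = ("GPU:0", "GPU:1")
t = Tensor.ones(8, 8).shard(devices, axis=0)  # rows split across devices
print((t + 1).numpy().shape)                  # ops run per-shard -> (8, 8)
```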