10417 Commits

Author SHA1 Message Date
George Hotz
de7a3a56ff save lines in llvm (#3231)
* save lines in llvm

* no implied cast in load

* no cast in gate
2024-01-24 11:40:53 -08:00
George Hotz
83d614295e reduce lines (#3230) 2024-01-24 10:35:59 -08:00
chenyu
afeadbedc9 touch up Tensor.round and Tensor.neg (#3228) 2024-01-24 12:29:37 -05:00
Obada Khalili
0e103b4aa0 implement Tensor.round (#3225) 2024-01-24 11:49:17 -05:00
geohotstan
842053873d fix neg logical_not inconsistencies (#3222)
* try

* test: add logical_not tests

* gah im retarded, but this doesn't match types for const()

* fix: can't we just do this?

* big change: I don't actually know what I'm doing

* WOOO IM JUST CHANGING EVERYTHING WOW probably gon revert later

* BYE BYE noqa: E501

* fix: less lines and add test

* fix: rm 2 redundant tests

* fix: eq with False so we don't unintentionally implicit upcast, but it's bool anyways so w/e
2024-01-24 11:48:40 -05:00
George Hotz
e2e4632aea LoadOps SYNC (#3223)
* LoadOps SYNC and WAIT

* no wait, only sync

* DEBUG >= 1

* track cross device
2024-01-23 21:59:18 -08:00
chenyu
2f4b3ab1c0 shard and to should preserve requires_grad (#3224)
dtypes are inferred from underlying lazydata, requires_grad needs to be passed explicitly
2024-01-24 00:15:10 -05:00
George Hotz
23b084e70a add device name to device, all are constructed (#3221) 2024-01-23 20:34:56 -08:00
George Hotz
91a1b2bd7a the runner does the build (#3220) 2024-01-23 18:45:43 -08:00
chenyu
9e5409be6c cifar move GlobalCounters.reset() before shard (#3217)
* cifar move GlobalCounters.reset() before shard

also shard mini batch inplace

* don't eval with DISABLE_BACKWARD
2024-01-23 16:07:43 -05:00
Francis Lam
595d05a250 test: fix test_linearizer to use the correct tc_dims (#3218)
also re-enable the test_tensor_core_opts
2024-01-23 16:07:31 -05:00
chenyu
3c179cc27c cifar only shuffle data at epoch start (#3216)
save 1ms CPU time per batch. also only shuffle training set
2024-01-23 14:41:22 -05:00
George Hotz
4a07ea355d buffer options should work (#3211)
* buffer options should work

* minor

* fix dtype
2024-01-22 19:23:55 -08:00
George Hotz
a06f34ae42 remove dead lines from cstyle (#3212)
* remove dead lines from cstyle

* external_local_bufs is dead

* more lines

* minor cleanup
2024-01-22 18:59:19 -08:00
chenyu
8465938d29 minor hlb_cifar cleanups (#3208)
mostly cosmetic. LATEBEAM=4 single 7900xtx 59.2 seconds
2024-01-22 12:38:39 -05:00
David Hou
3378625773 name upcast variables (#3200)
* name upcast variables

* typing

* unused
2024-01-22 11:37:28 -05:00
chenyu
827b7a3c64 cleanup pad_reflect and make_square_mask in hlb_cifar (#3206)
removed some complicated looking stuff. no wall time difference
2024-01-22 11:30:46 -05:00
chenyu
99884f4c98 cifar flags for RANDOM_CROP, RANDOM_FLIP, and CUTMIX (#3204)
experimenting with different setups, also would like to jit the data augmentation next
2024-01-22 01:12:51 -05:00
chenyu
53afec2841 add HALF to handcode_resnet50_opt.py (#3202)
use this to study tensor cores on HIP
2024-01-21 23:03:59 -05:00
chenyu
836883fedc comment out cutmix in hlb_cifar (#3201)
it's no-op with multi gpu and less STEPS. also the patch was selected from the whole dataset, not from the same batch
2024-01-21 22:24:53 -05:00
chenyu
e6c71f1b26 fix device of Tensor.arange inside Tensor.one_hot (#3199)
it should have the same device as self
2024-01-21 21:03:50 -05:00
chenyu
f7d1c42239 cleanup noop prefixes in _pool (#3198)
* cleanup noop prefixes in _pool

make expand dim=None as noop (in addition to -1). then slice, reshape, expand in _pool can share the same noop prefix

* nit

* something then reshape style

* that's repeat
2024-01-21 20:03:32 -05:00
uuuvn
640e5c36ad Fix metal tests broken by 3f56d1a (#3196)
* Remove from binary_operations before copying binary_operations into integer_binary_operations

* Also remove lt and eq if running on METAL
2024-01-21 11:53:25 -05:00
chenyu
b9d27636aa cleanup test_ops.py (#3192)
- removed exact duplicated tests
- only kept one function if torch_fxn is the same as tinygrad_fxn
- used tensor method instead of class method style
- replaced unneeded `lambda f: f(x)` with just `f`
- re-enabled commented tests that work now
- removed some forward_only now 0 shape tensor can backward
2024-01-20 20:08:56 -05:00
chenyu
3f56d1a5e8 add operator.lt and operator.eq to test_dtype_alu (#3191)
* add operator.lt and operator.eq to test_dtype_alu

those should pass now as we have broadcasted before passing to lt and eq.
also updated the test skipping criteria to reuse test_dtype.is_dtype_supported

* llvm lt nan is incorrect

* enable truediv too

* Revert "enable truediv too"

This reverts commit df703235fb.

* just that
2024-01-20 14:54:02 -05:00
chenyu
c4b5661146 fuzz length for multitensor reduce test case (#3190)
so that the uneven case is not just with 0 length and can have other positive values
2024-01-20 00:44:38 -05:00
chenyu
fdb1c2b1d9 move reduce over 0 len axis logic to lazy.py (#3188)
* move reduce over 0 len axis logic to lazy.py

this fixed uneven shard reduce case if the uneven one has length 0

* fix interpreted backends

* fix backwards for 0 shape tensors too
2024-01-20 00:13:03 -05:00
chenyu
485332935e ring copy example (#3185)
* ring copy example

* use ones for init
2024-01-19 23:34:30 -05:00
George Hotz
254a7372fe buffer copy refactor (#3187) 2024-01-19 20:21:24 -08:00
chenyu
fb4bd2a57d reenable padto to search action (#3183) 2024-01-19 14:17:53 -05:00
chenyu
cb4cfc078a parameterize multitensor tests for reduce (#3181)
uneven shards reduce is incorrect now
2024-01-19 14:03:01 -05:00
nimlgen
5097d5b808 fix padto when with late reduce (#3180)
* fix padto test

* no long comment
2024-01-19 14:01:44 -05:00
George Hotz
729a01bf3e complex PRs will not be merged 2024-01-19 10:58:47 -08:00
nimlgen
f87ecbb0f3 fuzzer validates outputs + (partially) oob accesses (#3178)
* fuzzer validates outputs + (partially) oob accesses

* +random

* oob check only for compiled

* type cmp fixes

* fix zeroing

* no prints

* add seed
2024-01-19 13:34:51 -05:00
chenyu
b2571d586c hypothesis.st -> hypothesis.strat (#3179)
leave `st` for shapetracker
2024-01-19 11:55:26 -05:00
chenyu
c4faedebf3 add test cases for negative entry max allreduce (#3177) 2024-01-18 22:26:51 -05:00
chenyu
ab1b7c4d09 fix allreduce for max (#3175)
* test cases to show allreduce for max is incorrect

* oh fixed
2024-01-18 20:25:35 -05:00
George Hotz
c51c90bcd4 more sync in transfer (#3174) 2024-01-18 17:17:03 -08:00
chenyu
28dcbf0e00 test case sharded batchnorm has different ast on devices (#3172) 2024-01-18 18:12:15 -05:00
chenyu
a60d50487d disable padto, seems to have bug in gpt2 (#3173) 2024-01-18 18:09:30 -05:00
George Hotz
c80884884e event driven hip (#3160)
* event driven hip

* simpler, src makes copy

* pass mypy
2024-01-18 14:35:18 -08:00
George Hotz
d2aab65958 remove unused expr node (#3170)
* remove unused expr node

* still works

* simple expr_idxs

* fixup typing
2024-01-18 14:18:43 -08:00
chenyu
097b1390ec touchup test_indexing (#3169) 2024-01-18 14:32:43 -05:00
George Hotz
a04e4d0442 inline clang renderer (#3168) 2024-01-18 11:17:34 -08:00
geohotstan
efbe4788d1 indexing: Final cleanup (#3156)
* init

* feat: add _to_const_val to getitem

* doc: changed docs

* docs: updated more docs

* merge: improved/fancy

* better error msg, minor cleanups

* feat: added index_put to test_indexing

* clean: test_indexing

* revert: gather changes lol

* refactor: use dict for tracking tensor indexing, also asserts for type

* oooooooooops

* ugh

* will revert this commit xD

* fix: removed asserts

* improvement: made in-line if statement clearer

* improved err message and improved slice_int tests

* fix: recover accidentally deleted line

* finishing touches

* reword some docs and del torch device tests in test_indexing

* del some redundant tests

* revert: gather asserts, do it in separate pr

* fix some data_ptr stuff

* done

* done done
2024-01-18 14:08:03 -05:00
chenyu
e139ae550d smaller limit_dims_to_max (#3167)
same questionable logic, but less lines now
2024-01-18 13:02:20 -05:00
nimlgen
992067399e clean up exceptions in __del__ everywhere (#3165) 2024-01-18 08:34:09 -08:00
Max-We
0338903429 Update kits19.py (#3166) 2024-01-18 08:33:50 -08:00
George Hotz
67bc2ddfd8 JIT cleanups (#3164)
* move GraphException

* factor out apply_graph_to_jit

* that return was wrong
2024-01-17 23:39:57 -08:00
George Hotz
f0c178b7e9 move get_contraction to helpers (#3162)
* move get_contraction to helpers

* move simplify

* lines

* to_movement_ops is not generic
2024-01-17 19:13:11 -08:00