Commit Graph

3293 Commits

Author SHA1 Message Date
Ankit Avinash
7647cd8428 [bounty] Stride is flip (#8792)
* replace stride with flip

* Complete replacing stride with flip

clean flip function in view.py
fix tests

* fix tests for multi shapetracker

* fix tests for fuzz shapetracker

* fix tests for fuzz shapetracker

* debug

* debug

* fix

* fix

* fix

---------

Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-01-31 11:34:10 +09:00
chenyu
0513b0c17d lower green test_gemm_8192 tflops to 125 [pr] (#8820)
flaky
2025-01-30 17:30:08 -05:00
Ignacio Sica
f0924e0857 fix and test (#8814)
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-01-30 16:35:53 -05:00
qazal
530961f7d5 realized only exists on base (#8815)
* realized only exists on base [pr]

* shorter

* update that too
2025-01-30 23:02:25 +02:00
Sieds Lykles
7cdc607544 add max as associative (#8816) 2025-01-30 16:01:42 -05:00
qazal
5643429c17 give BUFFER UOp a ShapeTracker [pr] (#8811)
* give BUFFER UOp a ShapeTracker [pr]

* move that

* update contiguous

* test_advancedindex should use movement ops
2025-01-30 22:33:32 +02:00
chenyu
5527f86a8f skip tests in test_indexing that set stride with lazydata.view [pr] (#8813) 2025-01-30 15:17:35 -05:00
nimlgen
a2faa5e49b am: fix pt free (#8810) 2025-01-30 15:14:55 +03:00
Sieds Lykles
78c0455c7a Better stable sigmoid (#8806)
Uses `1/(x*x) -> 1/x * 1/x` together with `x/(1+x) -> 1-1/(1+x)` to
rewrite sigmoid, instead of `x/((x+1)(x+1)) -> 1/(x+1)*(1-1/(x+1))`.

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-01-29 16:08:53 -05:00
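A minimal sketch in plain Python floats (not tinygrad's actual rewrite rules) of why identities like these improve stability: for large inputs, `exp(x)` overflows to IEEE infinity, and the rewritten forms avoid producing `inf/inf`:

```python
import math

# For large x, u = exp(x) overflows to IEEE infinity.
u = float("inf")

# Naive ratio form sigmoid(x) = u/(1+u): inf/inf evaluates to nan.
naive = u / (1.0 + u)

# Rewritten via x/(1+x) -> 1 - 1/(1+x): 1 - 1/inf == 1.0, the correct limit.
rewritten = 1.0 - 1.0 / (1.0 + u)

print(naive, rewritten)  # nan 1.0

# The 1/(x*x) -> 1/x * 1/x identity similarly avoids overflowing the square:
x = 1e160
print(1.0 / (x * x))          # x*x overflows to inf, so this is 0.0
print((1.0 / x) * (1.0 / x))  # ~1e-320: subnormal, but nonzero
```

The same reasoning carries over to float32 kernels, where the overflow thresholds are much lower than in Python's float64.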
Ignacio Sica
260df1a17f tc_select noop (#8801)
* tc_select noop

* revert changes in test
2025-01-29 13:53:23 -05:00
qazal
ba17786068 do not construct unmasked VALID (#8759)
* new lines that exist in codegen/ops

* update tests

* update sops.gz (13071 -> 13070 asts)

* fix viz too

* remove that TODO

* diff pruning

* mask assert + device

* work

* diff pruning

* re: fix viz too

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-01-28 20:51:21 +02:00
qazal
3417bc1814 fix ShapeTracker spec for const [pr] (#8791) 2025-01-28 19:53:36 +02:00
qazal
e8be8a5835 support lowering CONST(VIEW) in lowerer (#8785) 2025-01-28 12:04:41 +02:00
George Hotz
80089536e5 Revert "move llvm_bf16_cast to renderer for CLANG and LLVM [pr] (#8720)" (#8786)
This reverts commit af0452f116.
2025-01-28 18:59:02 +09:00
mesozoic-egg
af0452f116 move llvm_bf16_cast to renderer for CLANG and LLVM [pr] (#8720)
* handle bf16 via bitcasting for CLANG and LLVM

* On LLVM, skip float16 cast

* float32 on llvm lite, float32 elsewhere

* code format

* trigger pr

* move to rewriter

---------

Co-authored-by: Mesozoic Egg <mesozoic.egg@proton.mail>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-01-28 18:16:43 +09:00
qazal
aefbc2637f test fixups from unmasked valid deletion [pr] (#8776) 2025-01-28 09:23:30 +02:00
qazal
ed672881b0 remove additions/deletion in pr + check uops are equal [pr] (#8779)
* use warnings there [pr]

* remove those + move assert_diff [pr]

* warn after log

* remove

* back
2025-01-28 08:57:34 +02:00
George Hotz
62655e4999 move multi into engine [pr] (#8778)
* move multi into engine [pr]

* all runtime is one sz
2025-01-28 09:15:29 +09:00
Ignacio Sica
b240f12593 [TIP-9] rename Opt's amt to arg 2 (#8770)
* rename Opt amt to arg

* ignore_beam_cache for test_tiny

* move ignore_beam_cache to test_tiny

* move to separate pr

* revert space change

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-01-27 14:19:04 -05:00
Ignacio Sica
ed1b573868 ignore beam cache in test_tiny for stateless beam (#8771) 2025-01-27 12:56:30 -05:00
George Hotz
3ed146a5ff Revert "rename Opt amt to arg (#8767)" (#8769)
This reverts commit bf041659a5.
2025-01-27 23:46:37 +09:00
Ignacio Sica
bf041659a5 rename Opt amt to arg (#8767) 2025-01-27 23:36:47 +09:00
George Hotz
96bff0b4f7 contiguous is no longer needed in SGD [pr] (#8760)
* contiguous is no longer needed in SGD [pr]

* add allow condition
2025-01-27 15:19:11 +09:00
George Hotz
a9d9f98d05 hotfix: those tests fail locally on mac due to buffer count 2025-01-27 07:53:48 +09:00
qazal
ac70f63d4b tensor_map cleanups [pr] (#8754)
* tensor_map cleanups [pr]

* update test_schedule too
2025-01-26 11:41:54 +02:00
George Hotz
b53fe7c2fc remove unused ctx [pr] (#8751)
* remove unused ctx [pr]

* fix test
2025-01-26 17:59:15 +09:00
George Hotz
b4bf6a7dea switch backward to use gradient [pr] (#8235)
* switch backward to use gradient [pr]

* set device correctly, dedup

* why does that fail?

* add noop cast

* simple backward

* fix beautiful_mnist

* touchups

* set in compute_gradient

* uop_count

* uop_count was wrong

* collections

* no note

* skip that test

* update sched kernel counts

* train mnist is 65

* fix metadata and gc

* fixes

* materialize_grads

* no pathlib stuff

* add contiguous_backward, fix bugs

* add some realize

* fix multi
2025-01-26 09:12:16 +09:00
George Hotz
0ffd572e1e fix multi with no real srcs (#8749) 2025-01-26 08:41:00 +09:00
qazal
0e42befc6e viz cleanups 2 [pr] (#8748)
* viz cleanups 2 [pr]

* test_viz updates
2025-01-25 19:41:57 +02:00
qazal
a037201168 test_viz cleanups + move to /unit directory (#8746)
* test_viz cleanups + move to /unit directory

* lint
2025-01-25 14:33:31 +02:00
chenyu
e2b380b743 make UOp.multi real a tuple instead of list [pr] (#8744)
tuple is immutable. also updated test_rand_like_from_alu test
2025-01-24 20:47:27 -05:00
chenyu
e0e176efbc failed test case for multi rand_like [pr] (#8740)
new multi broke multi device dropout
2025-01-24 13:56:51 -05:00
nimlgen
dc10187fc0 am: add am_smi (#8739)
* am: start monitor

* cleanups

* fixes

* hmm

* progress

* cleanup
2025-01-24 20:16:19 +03:00
George Hotz
e82ba1454b MultiLazyBuffer is UOp [pr] (#8662)
* MultiLazyBuffer is UOp [pr]

* this is new mlb

* this is the idea

* progress

* multitensor works

* more movement ops

* this

* MultiLazyBuffer is UOp

* cleanups

* multi axis

* fix more tests

* work

* not that

* add multi grad and move shard to ops

* mops not views

* no double contig

* sweet, all mt tests passing

* port old logic

* remove lbs

* fix realized

* whitespace

* assign tweak

* test_assign_kv_cache_multi passes

* fix is_realized

* fix JIT for multi

* just a few more lines i'll pay them back soon i swear please bro just a few more

* no split reduceop for multi
2025-01-24 13:28:55 +09:00
qazal
8e5bd0cd7a fix buffer init and skip test_swizzle_failure_permute [pr] (#8732)
* fix buffer init and skip test_swizzle_failure_permute [pr]

* replace preload with just load

* add
2025-01-23 17:21:38 +02:00
nimlgen
e4512baea4 am: cleanup mm (#8730)
* am: cleanup mm

* cle

* ops

* entries
2025-01-23 15:49:37 +03:00
qazal
07ec99001a keep VIEW in big_sink + copy of buffer view spec [pr] (#8727)
* keep views in sink [pr]

* tests

* things from the gpt2 bug
2025-01-23 11:29:30 +02:00
qazal
6cb74bb630 fix using clone with shrink [pr] (#8724)
* fix using clone with shrink [pr]

* remove extra arg, add test_clone_with_shrink_realized
2025-01-23 08:28:07 +02:00
qazal
907dfa0e82 image buffer realization spec [pr] (#8420)
* image buffer realization spec [pr]

* redo the spec

* work
2025-01-22 20:25:22 +02:00
nimlgen
93fb50ce77 allreduce: add flags (#8713) 2025-01-22 17:44:31 +03:00
qazal
2dae467b75 scheduler + process_replay import cleanup (#8711) 2025-01-22 12:44:07 +02:00
qazal
e3d1464ba4 move assign preload out of schedule item [pr] (#8710)
* move assign preload out of schedule item [pr]

* fix that
2025-01-22 12:43:57 +02:00
nimlgen
c5e46c5eee am: recover from any boot interrupt (#8703)
* am: recover from any load interrupt

* add fuzzer

* nu
2025-01-21 22:22:23 +03:00
George Hotz
018edd934b don't use view in copy [pr] (#8704)
* don't use view in copy [pr]

* oh, remove double contig

* fix reps
2025-01-21 09:57:47 -08:00
qazal
d6bf1feaab remove the "no copy" line from copy_to_device (#8702)
* delete the no copy one

* add tests
2025-01-21 17:09:33 +02:00
nimlgen
3628f89929 fix deallocate for subbuffers (#8701)
* fix deallocate for subbuffers

* forgot this

* rm name

* hmm
2025-01-21 16:34:19 +03:00
qazal
f0d424ecdf Tensor UOps can become a buffer or const after scheduling (#8698)
* spec

* work

* update test_viewed_consts_do_not_realize

* remove
2025-01-21 12:33:19 +02:00
qazal
e2008c98c3 allow symbolic shape in tensor const parents [pr] (#8699) 2025-01-21 12:01:25 +02:00
qazal
66ac0087e8 more high level contiguous tests + scheduler deletions [pr] (#8695)
* delete those

* move the upat too

* rename ops_folding to just sym

* keep that
2025-01-21 01:52:58 +02:00
qazal
08eb1f1f56 simplify tensors before scheduling [pr] (#8580)
* delete forced_realize

* put that back

* work

* remove forced_realize

* expectedFailures

* contiguous(buffer)

* multi

* expectedFailures

* cleaner create_subbuffer

* more comments

* remove that

* note

* realizes

* work

* one upat and image is back

* remove

* cleaner

* fix test_complex_backward for now

---------

Co-authored-by: George Hotz <geohot@gmail.com>
2025-01-20 23:42:42 +02:00