7626 Commits

Author SHA1 Message Date
George Hotz
2454bf01c3 hotfix: remove shapetracker spam in viz 2025-01-27 07:20:21 +09:00
qazal
d488bbb1ec share merge_views/valid creation for CONST/DEFINE_VAR (#8758)
* share valid creation behavior for CONST/DEFINE_VAR

* work
2025-01-26 17:41:54 +02:00
qazal
bbb2dd8141 move VALID creation after merging the views (#8757)
* do valid creation later

* work for view_left

* only view(const) makes valids in view_left

* cleaner bind diff
2025-01-26 16:58:05 +02:00
George Hotz
a6e496b195 remove Function class [pr] (#8753)
* remove Function class [pr]

* actually remove function

* fix docs
2025-01-26 18:58:02 +09:00
qazal
ac70f63d4b tensor_map cleanups [pr] (#8754)
* tensor_map cleanups [pr]

* update test_schedule too
2025-01-26 11:41:54 +02:00
George Hotz
b53fe7c2fc remove unused ctx [pr] (#8751)
* remove unused ctx [pr]

* fix test
2025-01-26 17:59:15 +09:00
qazal
06b58aa7ec move unneeded fields out of ScheduleContext [pr] (#8752) 2025-01-26 10:36:15 +02:00
George Hotz
1b4618e257 gradient cleanup (#8750)
* switch backward to use gradient [pr]

* set device correctly, dedup

* why does that fail?

* add noop cast

* simple backward

* fix beautiful_mnist

* touchups

* set in compute_gradient

* uop_count

* uop_count was wrong

* collections

* no note

* skip that test

* update sched kernel counts

* train mnist is 65

* fix metadata and gc

* fixes

* materialize_grads

* no pathlib stuff

* add contiguous_backward, fix bugs

* add some realize

* fix multi

* remove unused backward passes [pr]

* lower line count
2025-01-26 09:30:55 +09:00
George Hotz
b4bf6a7dea switch backward to use gradient [pr] (#8235)
* switch backward to use gradient [pr]

* set device correctly, dedup

* why does that fail?

* add noop cast

* simple backward

* fix beautiful_mnist

* touchups

* set in compute_gradient

* uop_count

* uop_count was wrong

* collections

* no note

* skip that test

* update sched kernel counts

* train mnist is 65

* fix metadata and gc

* fixes

* materialize_grads

* no pathlib stuff

* add contiguous_backward, fix bugs

* add some realize

* fix multi
2025-01-26 09:12:16 +09:00
George Hotz
0ffd572e1e fix multi with no real srcs (#8749) 2025-01-26 08:41:00 +09:00
qazal
0e42befc6e viz cleanups 2 [pr] (#8748)
* viz cleanups 2 [pr]

* test_viz updates
2025-01-25 19:41:57 +02:00
nimlgen
c74c5901a8 am disable bind (#8747) 2025-01-25 19:06:35 +03:00
qazal
a037201168 test_viz cleanups + move to /unit directory (#8746)
* test_viz cleanups + move to /unit directory

* lint
2025-01-25 14:33:31 +02:00
chenyu
e2b380b743 make UOp.multi real a tuple instead of list [pr] (#8744)
tuple is immutable. also updated test_rand_like_from_alu test
2025-01-24 20:47:27 -05:00
George Hotz
cb0978b377 add Ops.CONTIGUOUS_BACKWARD (#8743) 2025-01-25 07:28:43 +09:00
nimlgen
2f06eccf1d am: script and vfio msg (#8742)
* am: script and vfio msg

* use sysfs bars always for now

* tiny changes
2025-01-25 00:33:00 +03:00
chenyu
0c759e1ff6 add bert to benchmark ci (#8741)
with `DISABLE_DROPOUT=1 BERT_LAYERS=2` for now
2025-01-24 14:45:11 -05:00
chenyu
e0e176efbc failed test case for multi rand_like [pr] (#8740)
new multi broke multi device dropout
2025-01-24 13:56:51 -05:00
nimlgen
dc10187fc0 am: add am_smi (#8739)
* am: start monitor

* cleanups

* fixes

* hmm

* progress

* cleanup
2025-01-24 20:16:19 +03:00
George Hotz
7a2223a6c6 add merge views to ops_folding [pr] (#8051)
Co-authored-by: qazal <qazal.software@gmail.com>
2025-01-24 17:45:11 +02:00
qazal
0814a79cb4 cleanup the merge_views upats [pr] (#8738) 2025-01-24 16:49:54 +02:00
qazal
07069b9988 rename to tensor_uop [pr] (#8737) 2025-01-24 13:42:25 +02:00
George Hotz
e82ba1454b MultiLazyBuffer is UOp [pr] (#8662)
* MultiLazyBuffer is UOp [pr]

* this is new mlb

* this is the idea

* progress

* multitensor works

* more movement ops

* this

* MultiLazyBuffer is UOp

* cleanups

* multi axis

* fix more tests

* work

* not that

* add multi grad and move shard to ops

* mops not views

* no double contig

* sweet, all mt tests passing

* port old logic

* remove lbs

* fix realized

* whitespace

* assign tweak

* test_assign_kv_cache_multi passes

* fix is_realized

* fix JIT for multi

* just a few more lines i'll pay them back soon i swear please bro just a few more

* no split reduceop for multi
2025-01-24 13:28:55 +09:00
chenyu
eb77488f85 update llama3 70B to use R1 (#8733) 2025-01-23 19:06:05 -05:00
George Hotz
3e987fc856 add device print with -m tinygrad.device [pr] (#8729)
* add device print with -m tinygrad.device [pr]

* fix linter
2025-01-24 05:46:27 +09:00
geohotstan
04846b91aa reorder and categorize onnx_ops (#8731)
* new order

* remove a todo

* constant node is definitely requires_grad false

* one new line spacing

* property and graph

* oops linter
2025-01-23 13:18:54 -05:00
qazal
8e5bd0cd7a fix buffer init and skip test_swizzle_failure_permute [pr] (#8732)
* fix buffer init and skip test_swizzle_failure_permute [pr]

* replace preload with just load

* add
2025-01-23 17:21:38 +02:00
nimlgen
e4512baea4 am: cleanup mm (#8730)
* am: cleanup mm

* cle

* ops

* entries
2025-01-23 15:49:37 +03:00
qazal
07ec99001a keep VIEW in big_sink + copy of buffer view spec [pr] (#8727)
* keep views in sink [pr]

* tests

* things from the gpt2 bug
2025-01-23 11:29:30 +02:00
qazal
6cb74bb630 fix using clone with shrink [pr] (#8724)
* fix using clone with shrink [pr]

* remove extra arg, add test_clone_with_shrink_realized
2025-01-23 08:28:07 +02:00
chenyu
af65331b76 update beam params for bert green [pr] (#8726)
increase BEAM_UPCAST_MAX and BEAM_LOCAL_MAX to default and matched red. 3% faster step
2025-01-22 22:00:05 -05:00
qazal
907dfa0e82 image buffer realization spec [pr] (#8420)
* image buffer realization spec [pr]

* redo the spec

* work
2025-01-22 20:25:22 +02:00
chenyu
49b914ee69 simpler bert acc [pr] (#8714)
logit.log_softmax().argmax(-1) is equivalent to logit.argmax(-1)
2025-01-22 10:32:19 -05:00
nimlgen
93fb50ce77 allreduce: add flags (#8713) 2025-01-22 17:44:31 +03:00
qazal
891436853d remove buffer size check in schedule item [pr] (#8712) 2025-01-22 13:36:30 +02:00
qazal
2dae467b75 scheduler + process_replay import cleanup (#8711) 2025-01-22 12:44:07 +02:00
qazal
e3d1464ba4 move assign preload out of schedule item [pr] (#8710)
* move assign preload out of schedule item [pr]

* fix that
2025-01-22 12:43:57 +02:00
chenyu
9a9079118e envvar BERT_LAYERS [pr] (#8709)
default is 24 for large
2025-01-21 22:49:19 -05:00
chenyu
9f6d545a16 bert log global_norm in training step [pr] (#8708)
* bert log global_norm in training step [pr]

and minor cleanups

* .item()
2025-01-21 20:36:27 -05:00
nimlgen
c5e46c5eee am: recover from any boot interrupt (#8703)
* am: recover from any load interrupt

* add fuzzer

* nu
2025-01-21 22:22:23 +03:00
chenyu
1e283c33d3 remove realize in bert model init [pr] (#8707) 2025-01-21 14:11:03 -05:00
George Hotz
018edd934b don't use view in copy [pr] (#8704)
* don't use view in copy [pr]

* oh, remove double contig

* fix reps
2025-01-21 09:57:47 -08:00
qazal
d6bf1feaab remove the "no copy" line from copy_to_device (#8702)
* delete the no copy one

* add tests
2025-01-21 17:09:33 +02:00
nimlgen
3628f89929 fix deallocate for subbuffers (#8701)
* fix deallocate for subbuffers

* forgot this

* rm name

* hmm
2025-01-21 16:34:19 +03:00
nimlgen
6733a3a96b am: fix typo (#8700) 2025-01-21 14:35:15 +03:00
qazal
f0d424ecdf Tensor UOps can become a buffer or const after scheduling (#8698)
* spec

* work

* update test_viewed_consts_do_not_realize

* remove
2025-01-21 12:33:19 +02:00
qazal
e2008c98c3 allow symbolic shape in tensor const parents [pr] (#8699) 2025-01-21 12:01:25 +02:00
nimlgen
2b239db5d2 temp() with usernames (#8697) 2025-01-21 12:26:43 +03:00
qazal
66ac0087e8 more high level contiguous tests + scheduler deletions [pr] (#8695)
* delete those

* move the upat too

* rename ops_folding to just sym

* keep that
2025-01-21 01:52:58 +02:00
qazal
08eb1f1f56 simplify tensors before scheduling [pr] (#8580)
* delete forced_realize

* put that back

* work

* remove forced_realize

* expectedFailures

* contiguous(buffer)

* multi

* expectedFailures

* cleaner create_subbuffer

* more comments

* remove that

* note

* realizes

* work

* one upat and image is back

* remove

* cleaner

* fix test_complex_backward for now

---------

Co-authored-by: George Hotz <geohot@gmail.com>
2025-01-20 23:42:42 +02:00