Commit Graph

34 Commits

Author SHA1 Message Date
George Hotz
6707c778d0 scheduleitem is not Tuple [run_process_replay] (#5425)
* scheduleitem is not Tuple [run_process_replay]

* fix tests

* fix op + fuzzers

* fix mop test
2024-07-12 15:13:19 -07:00
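
A minimal sketch of the shape of this change, assuming the commit turns ScheduleItem from a plain tuple into a class of its own; the field names below are illustrative assumptions, not tinygrad's actual definitions:

    from dataclasses import dataclass
    from typing import Any, Tuple

    # hedged sketch: a ScheduleItem as a frozen dataclass rather than a Tuple,
    # so it gets a distinct type instead of structural tuple semantics
    @dataclass(frozen=True)
    class ScheduleItem:
      ast: Tuple[Any, ...]   # the kernel AST(s) this item will run
      bufs: Tuple[Any, ...]  # the buffers the kernel reads and writes
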
George Hotz
f6ef283e6a s/loadops/metaops [run_process_replay] (#5421) 2024-07-12 13:26:50 -07:00
chenyu
46a793111b test for LazyBuffer._view when mask out and degrade into const (#4465)
changed the condition from all masked dims being 0 to any masked dim being 0. It's a no-op because the ShapeTracker rewrites the whole mask to 0 if any dim has a 0, as part of canonicalization
2024-05-07 12:56:23 -04:00
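
A minimal sketch of the behavior under test, assuming the current tinygrad Tensor API (pad/shrink signatures may differ across versions): a view that lies entirely inside the masked (padded) region only ever reads the pad value, so it can degrade into a const.

    from tinygrad import Tensor

    t = Tensor.ones(2, 2).pad(((2, 0), (0, 0)))  # rows 0-1 are the zero mask
    v = t.shrink(((0, 2), (0, 2)))               # view falls fully inside the mask
    print(v.numpy())                             # all zeros: foldable to a const
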
George Hotz
68ca4d4276 split to schedule.py (#3949)
* split to schedule.py

* split
2024-03-26 21:02:46 -07:00
George Hotz
150ea2eb76 create engine folder and move code (#3948)
* retry

* older tf

* that
2024-03-26 20:38:03 -07:00
qazal
337cd53444 multioutput ScheduleItem (#3699)
* refactor realize.py

* update docs

* update test_sched

* update runners and devices

* update openpilot and unit tests

* cleanup runner lowering

* update more tests
2024-03-13 08:59:38 -07:00
chenyu
2127c1c6c2 test for the split reduce kernel (#3515)
somehow this was not tested
2024-02-27 21:29:25 -05:00
qazal
7919a1e6ec dtypes: delete the float cast in realize.py (#3401)
* remove float cast

* cast scalars to the correct value at creation time

* cast scalar in the correct place

* wrong, use y_dtype

* make consts have a unique cache key

* add cast_scalar back

* test_load_cache_const_bufs

* add bool dtype

* test_const_dtype

* fix linters
2024-02-15 14:20:30 +01:00
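
A hedged illustration of the intent (dtype promotion details are version-dependent): with the float cast gone from realize.py, a scalar const carries its target dtype from creation instead of being materialized as a float and cast afterwards.

    from tinygrad import Tensor, dtypes

    a = Tensor([1, 2], dtype=dtypes.int32) + 1  # the 1 is created as an int32 const
    b = Tensor([1.0, 2.0]) + 1                  # here the 1 is created as float32
    print(a.dtype, b.dtype)                     # both keep their creation dtype
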
George Hotz
6a4a5dc79d fix pad 0 size (#3277)
* fix pad 0 size

* put in view, not pad

* test was wrong
2024-01-30 08:58:10 -08:00
George Hotz
ea5824657d move fromcpu out of lazy.py (#3122)
* move fromcpu out of lazy.py

* fix abstractions2
2024-01-14 18:21:08 -08:00
chenyu
a300fea2a4 failed test case due to cast resets shapetracker (#3109)
cast implicitly resets the shapetracker and makes it contiguous (for disk tensors), which fails on the Interpreted backend if the inputs contain a non-contiguous st.
2024-01-13 12:46:51 -05:00
chenyu
7086d77db1 bugfix do not reset shapetracker of 0 size lazybuffer (#3096)
it might be coming from an expand, and resetting it results in incorrect strides. Caught by the interpreted backend.
2024-01-11 23:22:52 -05:00
chenyu
dab8214103 unit tests for Device.canonicalize (#3055) 2024-01-09 12:47:20 -05:00
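A minimal sketch of what these unit tests cover, assuming current import paths (Device historically lived in tinygrad.device): canonicalize maps user-supplied device strings, and None, onto one canonical spelling.

    from tinygrad import Device

    print(Device.canonicalize(None))    # the default device, e.g. "CPU" or "METAL"
    print(Device.canonicalize("cpu"))   # "CPU"
    print(Device.canonicalize("gpu:0")) # "GPU" -- assumption: a bare ":0" suffix folds away
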
George Hotz
7ea2e0035b move children for speed (#3047)
* move children for speed

* no children anymore
2024-01-08 15:02:32 -08:00
chenyu
81b97cd2c6 canonicalize device in LazyBuffer constructor (#2991)
fixed the multitensor +1 then sum bug
2024-01-03 12:55:25 -05:00
George Hotz
1765849937 new lazy, benchmark (#2878)
* lazy rewrite, try 2

* min fix tests

* pass contig test

* put broken pads back

* move that to realize

* no contig child fixes array packing

* so wrong

* now that's correct

* base children

* fix bind issues

* disable to_image_idx

* fix tests

* that failure shouldn't break other tests

* more fixes

* fix torch

* skip failing tests in CI

* 1e-7

* half is broken

* 1e-6 margin of error
2023-12-20 14:33:21 -08:00
George Hotz
2c363b5f0b new style device (#2530)
* cpu tests pass

* torch works

* works

* metal works

* fix ops_disk

* metal jit works

* fix openpilot

* llvm and clang work

* fix webgpu

* docs are rly broken

* LRU works on metal

* delete comment

* revert name to ._buf. LRU only on Compiled

* changes

* allocator

* allocator, getting closer

* lru alloc

* LRUAllocator

* all pass

* metal

* cuda

* test examples

* linearizer

* test fixes

* fix custom + clean realize

* fix hip

* skip tests

* fix tests

* fix size=0

* fix MOCKHIP

* fix thneed

* copy better

* simple

* old style metal copy

* fix thneed

* np reshape

* give cuda a device
2023-11-30 17:07:16 -08:00
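
A hedged sketch of the LRUAllocator idea introduced here (names and the eviction policy are simplified): freed device buffers are parked in a per-size pool and handed back on the next same-size allocation, instead of being returned to the device.

    from collections import defaultdict

    class LRUAllocatorSketch:
      def __init__(self, backend):
        self.backend, self.pool = backend, defaultdict(list)
      def alloc(self, size: int):
        # reuse a cached buffer of this size if one exists, else really allocate
        return self.pool[size].pop() if self.pool[size] else self.backend.alloc(size)
      def free(self, size: int, buf):
        # keep the buffer for reuse; a real LRU also evicts under memory pressure
        self.pool[size].append(buf)
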
Christopher Mauri Milan
7f01dd04f0 Apply ruff linting rules to tests (#2473)
* everything except F821

* enable F821 with noqa

* dumb fix

* fix remaining imports and (former) lambdas

* replace _ with noqa to avoid gc
2023-11-27 21:24:06 -08:00
George Hotz
9e07824542 move device to device.py (#2466)
* move device to device.py

* pylint test --disable R,C,W,E --enable E0611

* fix tests
2023-11-27 11:34:37 -08:00
chenyu
a753c8e071 examples of new GPT2 and JIT change (#2261)
* var_vals are global

* working with global ish

* better

* fix export model

* fix tests

* better kv cache

* does it run?

* use where for kvmask

* fix excessive var_vals

* fix import

* how does multigpu use this?

* llama kinda work

* faster and simpler

* cleanup

* fix conversation mode

* test cleanups

* fix one more test

* test cleanup

---------

Co-authored-by: George Hotz <geohot@gmail.com>
2023-11-10 15:07:02 -05:00
George Hotz
de5d603ec1 corealize + remove realize from lazybuffer (#1968)
* corealize + remove realize from lazybuffer

* fix multigpu

* fix graph
2023-10-04 10:59:31 -07:00
George Hotz
d449b3bef1 think about removing realize from lazybuffer (#1965)
* remove realize from lazybuffer

* okay fine, back that off

* fix tests maybe

* fix test
2023-10-04 07:18:58 -07:00
George Hotz
6a79d4044a unrealized consts everywhere (#1963)
* unrealized consts everywhere

* don't import device from lazy

* Device isn't in Lazy

* same issue

* disable jit random
2023-10-04 01:48:10 -07:00
nimlgen
f04c1a63ae Rand works in jit (#1960)
* rand works in jit

* better jitted rand creation

* Update realize.py

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-10-03 12:55:25 -07:00
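
A minimal sketch of the behavior this PR fixes, assuming current import paths (the JIT lived in tinygrad.jit at the time): rand inside a jitted function should produce fresh values on every call rather than replaying the first captured batch.

    from tinygrad import Tensor, TinyJit

    @TinyJit
    def step(x: Tensor) -> Tensor:
      return (x + Tensor.rand(4)).realize()

    x = Tensor.zeros(4).contiguous().realize()
    print(step(x).numpy())
    print(step(x).numpy())  # different random values once rand works under the JIT
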
George Hotz
73a6ed7862 Apply ShapeTracker in interpreted backends (#1846)
* applying st

* tests pass

* minor cleanups

* torch too

* hack

* contiguous

* move mops

* contig in BN

* tests should pass

* make torch fast

* make zeros and ones contig by default

* no contig there

* fix padding with expanding

* might fix tests

* still doesn't fix bug, but should be there

* Revert "still doesn't fix bug, but should be there"

This reverts commit 8ea92f3e07.

* minor cleanups
2023-09-23 10:05:13 +08:00
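
A hedged sketch of the idea in numpy terms: an interpreted backend can apply a ShapeTracker-style view with stride tricks instead of materializing every movement op (as_strided takes strides in bytes).

    import numpy as np

    base = np.arange(6, dtype=np.float32)
    view = np.lib.stride_tricks.as_strided(base, shape=(3, 2), strides=(8, 4))
    print(view)  # [[0. 1.] [2. 3.] [4. 5.]] -- a reshaped view, no copy
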
nimlgen
1c0449e190 add cache collector (#1595)
* init cache collector

* add test_cache_collector.py

* switch GlobalCounters.cache to CacheCollector

* init jit models test

* jitted SD

* add debug msg to print loaded bufs count

* moved cache collector to jit

* clearer SD

* no double device import
2023-08-28 19:59:55 -07:00
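
A hedged sketch of the collector pattern (method names here are illustrative assumptions): while capture is active, every kernel launch is appended to a cache that the JIT later replays.

    class CacheCollectorSketch:
      def __init__(self): self.cache = None
      def start(self): self.cache = []            # begin capturing
      def add(self, prg, bufs):
        if self.cache is not None: self.cache.append((prg, bufs))
      def finish(self):
        out, self.cache = self.cache, None        # stop capturing, hand back the log
        return out
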
chenyu
a89142e46f ShapeTracker.var_vals (#1540) 2023-08-14 18:53:37 -07:00
George Hotz
d24f936501 just cmplt (#1493)
* just cmplt

* fix maximum

* don't save, there's no backward

* ugh, no slot either

* eq is a scam
2023-08-08 13:58:10 -07:00
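
A minimal sketch of the reduction, one way the other comparisons can fall out of a single less-than primitive (CMPLT) pointwise:

    def cmplt(a, b): return a < b
    def cmpeq(a, b): return (not cmplt(a, b)) and (not cmplt(b, a))  # "eq is a scam"
    def maximum(a, b): return b if cmplt(a, b) else a

    assert cmpeq(3, 3) and not cmpeq(3, 4)
    assert maximum(2, 5) == 5
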
nimlgen
669b406ec6 correct children count with lazycache (#1429) 2023-08-05 00:30:16 -07:00
chenyu
18d0a93f09 LazyBuffer.get_variable_buffers() (#1391)
* LazyBuffer.get_variable_buffers()

* remove left_only, add ProdNode

* no vars for OpNode.b

* do not change symbolic vars, remove ProdNode
2023-08-02 09:01:35 -07:00
Francis Lam
df86672bd4 Fix LazyBuffer SHUFFLE_PAD_OPS to prevent invalid pad movement (#1223)
In addition to div, any op that generates non-zero outputs from zero inputs needs to be guarded.
2023-07-11 15:30:35 -07:00
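
A numeric illustration of the guard in numpy terms: moving a pad past an op is only safe when the op maps zero to zero. exp(0) == 1, so pad and exp do not commute; multiplying by a constant does.

    import numpy as np

    x = np.array([1.0, 2.0])
    print(np.exp(np.pad(x, 1)))  # padded region becomes 1.0
    print(np.pad(np.exp(x), 1))  # padded region stays 0.0 -- a different result
    print(np.array_equal(np.pad(x * 3, 1), np.pad(x, 1) * 3))  # True: safe to move
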
George Hotz
0ad99038ef Revert "Revert "Fix ShapeTracker mismatch in LazyBuffer.fromCPU (#1156)" (#1181)" + add test
This reverts commit a374b62bfe.
2023-07-07 18:37:04 -07:00
George Hotz
a374b62bfe Revert "Fix ShapeTracker mismatch in LazyBuffer.fromCPU (#1156)" (#1181)
This reverts commit 8ff7184b1b.
2023-07-07 18:29:05 -07:00
fluffy χατγιρλ
8ff7184b1b Fix ShapeTracker mismatch in LazyBuffer.fromCPU (#1156)
* init shape tracker with strides to fix mismatch

Author:    sekstini <sekstinilol@gmail.com>

* fix whitespace

* add tests
2023-07-07 18:28:21 -07:00
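
A hedged sketch of the mismatch in numpy terms: a host array need not be contiguous, so a ShapeTracker built from the shape alone (assuming default contiguous strides) would disagree with the buffer's real layout; initializing it from the array's strides avoids that.

    import numpy as np

    a = np.arange(6).reshape(2, 3)[:, ::2]  # shape (2, 2), but a strided view
    print(a.shape, a.strides)               # (2, 2) (24, 16); contiguous would be (16, 8)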