Commit Graph

412 Commits

Author SHA1 Message Date
Francis Lam
f445e056ed wmma: add test and tensor core shape (#1925) 2023-09-28 18:04:28 -07:00
George Hotz
c907efbf4a reorder a few things (#1915)
* reorder a few things

* huh, that has to be there

* move apply shapetracker

* BufferOps

* only for type checking
2023-09-25 10:17:21 +08:00
George Hotz
7ff7aacdb4 LazyOp out of Linearizer (#1908)
* loadop buffer on cpu

* works for GPU

* sort of working

* has bugs

* gpu tests pass

* fix some tests

* fix tensor cores

* fix test linearizer

* fix symbolic

* fix has_variable_shape

* non symbolic size

* disable weird test

* simple cache fix

* fix custom function

* fix kopt

* cleanups

* a bit broken on the assign

* contig check

* only buffer

* need that order

* idx

* dedup buffers

* hmm, bugfix

* fix tensor cores

* opts device
2023-09-24 14:30:53 +08:00
George Hotz
97dc813329 Revert "All LazyOps in the Linearizer (#1905)" (#1907)
This reverts commit a5820390db.
2023-09-24 11:51:22 +08:00
George Hotz
a5820390db All LazyOps in the Linearizer (#1905)
* loadop buffer on cpu

* works for GPU

* sort of working

* has bugs

* gpu tests pass

* fix some tests

* fix tensor cores

* fix test linearizer

* fix symbolic

* fix has_variable_shape

* non symbolic size

* disable weird test

* simple cache fix

* fix custom function

* fix kopt

* cleanups

* a bit broken on the assign

* contig check

* only buffer

* need that order

* idx
2023-09-24 11:50:00 +08:00
nimlgen
31fca43706 kopt works with local+grouped reduce and tests (#1824) 2023-09-09 13:22:09 -07:00
George Hotz
ed194a1d3b zero fold (#1748)
* add constant fold

* err, it's just zero folding

* self store fold + caching

* prints and more folds

* simpler winograd kernels

* remove childless uops
2023-09-03 13:48:11 -07:00
nimlgen
1c0449e190 add cache collector (#1595)
* init cache collector

* add test_cache_collector.py

* switch GlobalCounters.cache to CacheCollector

* init jit models test

* jitted SD

* add debug msg to print loaded bufs count

* moved cache collctor to jit

* clearer SD

* no double device import
2023-08-28 19:59:55 -07:00
George Hotz
a6d842af7a move device to ops (#1646)
* move device to ops

* mlops types

* 2 lines
2023-08-23 08:30:17 -07:00
David Hou
4fbce972d7 CSE at uop level (#1483)
* uop-level cse

* add test

* don't cache reduce alu ops

* types

* rename variable

* fix

* delete lines
2023-08-19 23:40:40 -07:00
David Hou
92754e177c cache buffer loads across multiple bufs (#1482)
* cache loads across buffers (since they may share rawbufs)

* typing

* add test

* fix test

* small changes to test

* fix test

* one big cache

* whitespace

* golf a line?

* invalid is RawBuffer(0)[0], valid 1.
2023-08-19 09:09:58 -07:00
David Hou
56ee97b37f dedup kernel args v2 (#1272)
* new version

* fix abstractions

* try remove test

* Revert "try remove test"

This reverts commit 2fc18a9f8e.

* assert_allclose

* minimize the test

* minimize the test

* minimize the test

* minimize the test

* Revert "minimize the test"

This reverts commit e0c0929596.

* Revert "minimize the test"

This reverts commit 88240551b1.

* Revert "minimize the test"

This reverts commit 78328a7ce2.

* Revert "minimize the test"

This reverts commit 989523fded.

* skip test inside body

* oops

* oops
2023-07-18 20:03:42 -07:00