848 Commits

Author SHA1 Message Date
chenyu
bfbd7c5461 more generic UOp mul mod folding (#5765) 2024-07-27 20:20:35 -04:00
chenyu
80c6475757 update test_uop_symbolic to test UOp min and max (#5764)
covers #5750, #5748, #5741
2024-07-27 19:53:21 -04:00
chenyu
dc7483ee6f UOp simple div folding (#5740)
made UOp.divides return the Optional[quotient] and used it for simple div folding
2024-07-26 17:14:32 -04:00
chenyu
a4e9ebc68a update test_uop_symbolic (#5733)
enabled more passed tests
2024-07-26 13:46:09 -04:00
chenyu
2cc55a3095 UOp simple mul add div fold (#5726) 2024-07-25 22:00:30 -04:00
chenyu
5521b6d437 UOp simple mul-add-lt fold (#5721) 2024-07-25 20:49:38 -04:00
chenyu
845b0d1c9d UOp more generic div folding (#5722)
old: `x // c` can fold if `0 <= x.vmin <= x.vmax < c`
new: `x // c` can fold if `0 < c and x.vmin // c == x.vmax // c`
2024-07-25 17:49:14 -04:00
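The old and new folding conditions described in 845b0d1c9d (#5722) can be compared with a small sketch. It works on plain Python integers rather than UOps, and the function names `old_fold_floordiv` and `new_fold_floordiv` are hypothetical, not tinygrad APIs. It assumes Python floor-division semantics and that `vmin <= vmax` bound the value of `x`.

```python
from typing import Optional

def old_fold_floordiv(vmin: int, vmax: int, c: int) -> Optional[int]:
    """Old rule: x // c folds (to 0) only if 0 <= vmin <= vmax < c."""
    if 0 <= vmin <= vmax < c:
        return 0
    return None

def new_fold_floordiv(vmin: int, vmax: int, c: int) -> Optional[int]:
    """New rule: x // c folds if c > 0 and both bounds give the same quotient.

    For c > 0, x // c is non-decreasing in x, so equal quotients at the
    two bounds imply the same quotient for every x in [vmin, vmax].
    """
    if c > 0 and vmin // c == vmax // c:
        return vmin // c
    return None

# x in [8, 11], c = 4: the old rule gives up, the new rule folds to a constant.
print(old_fold_floordiv(8, 11, 4))
print(new_fold_floordiv(8, 11, 4))
```

The new rule covers every case the old one did (any range inside `[0, c)` has quotient 0 at both bounds) and also folds ranges that sit entirely inside a later multiple of `c`.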
chenyu
46e1151c02 UOp more generic mul -> mod folding (#5698) 2024-07-24 21:41:25 -04:00
chenyu
66a9c372af UOp mod reduction (#5697) 2024-07-24 20:36:00 -04:00
chenyu
8648fb2636 UOp vmin/vmax on ADD (#5689) 2024-07-24 19:09:42 -04:00
chenyu
85710e86cb UOps div folding (#5690)
#5689, with just div folding and new test cases
2024-07-24 14:21:44 -04:00
chenyu
a7a77dfd83 UOp mul lt fold (#5677) 2024-07-24 02:49:25 -04:00
chenyu
4e85761d40 UOp mod folding (#5668) 2024-07-24 00:10:47 -04:00
chenyu
199b3bf02b simple UOp lt/ge folding (#5657)
works if lhs is a DEFINE_VAR.
folds trivial x < -math.inf now, need to change SPECIAL to use DEFINE_VAR to fold more
2024-07-23 14:11:05 -04:00
chenyu
e210c87b4a uop mod-mod simplification (#5650) 2024-07-23 12:33:55 -04:00
chenyu
4f83da626e uop symbolic simple mul mod (#5648) 2024-07-22 23:17:41 -04:00
chenyu
f2d2afdaa4 dumb linearizer example that max is not simplified (#5644)
* dumb linearizer example that max is not simplified

this might just get fixed once basic mod simplification is done

* need local
2024-07-22 18:37:26 -04:00
chenyu
97b116bb1d UOp mul div simplification (#5637)
* UOp mul div simplification

* != 0 is fine
2024-07-22 16:14:12 -04:00
chenyu
92e7e65712 one more test case for symbolic mod mul (#5615) 2024-07-20 17:23:06 -04:00
George Hotz
2e617ca59e lowerer img index (#5592) 2024-07-19 14:22:02 -07:00
George Hotz
2de82b8a5d remove get_lazyop_info (#5570)
* don't use get_lazyop_info more

* keep that min

* no ptx for that test
2024-07-19 03:05:33 -07:00
wozeparrot
90f0e2fc49 db in wal mode (#5388) 2024-07-12 20:43:36 -07:00
George Hotz
03c2dc8bd7 lowerer is kernel [run_process_replay] (#5437) 2024-07-12 18:50:55 -07:00
George Hotz
870dc8c350 s/Linearizer/Lowerer [run_process_replay] (#5428) 2024-07-12 15:54:07 -07:00
George Hotz
6707c778d0 scheduleitem is not Tuple [run_process_replay] (#5425)
* scheduleitem is not Tuple [run_process_replay]

* fix tests

* fix op + fuzzers

* fix mop test
2024-07-12 15:13:19 -07:00
George Hotz
d13654a820 move uopgraph to file [run_process_replay] (#5364)
* move uopgraph to file [run_process_replay]

* fix print tree test
2024-07-10 17:34:50 -07:00
chenyu
649641a2f2 fix tqdm with generator without __len__ (#5238)
it should be treated as total = 0 (just show iteration count).
also removed duplicated ": " in fetch and fixed unit scale with total = 0
2024-06-30 12:20:59 -04:00
chenyu
fd53b6d901 tqdm supports fractional blocks (#5233)
enabled progress bar match in test, it matched perfectly now
2024-06-29 22:30:18 -04:00
chenyu
ae10ae4722 simplify tqdm scale math (#5231)
expand the log of log stuff
2024-06-29 21:17:40 -04:00
chenyu
b2ea610df8 fix tqdm unit_scale and support hours in time (#5227)
* fix tqdm unit_scale and support hours in time

previously it only supports MM:SS.
more chars to unitscales, strip trailing "." and " " in formatting, and more tests

* simpler
2024-06-29 14:48:51 -04:00
chenyu
42d1f92fc1 simpler tqdm (#5221)
can do more, but many cases are not tested
2024-06-29 07:41:46 -04:00
Roelof van Dijk
9704c7d4d4 ruff rule if-exp-instead-of-or-operator (FURB110) (#5178)
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-06-27 08:22:19 -07:00
Roelof van Dijk
975b811ad9 names shadowing builtins (#5179)
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-06-27 08:15:01 -04:00
chenyu
33211f356b fix desc in tqdm (#5107)
per doc `https://tqdm.github.io/docs/tqdm/`, user does not need to put `: ` in desc, and `: ` is automatically removed after desc if the latter is empty.

updated test cases and added a test for set_description
2024-06-22 19:00:38 -04:00
chenyu
e356807696 tinytqdm.set_description and tinytrange (#5101) 2024-06-22 14:45:06 -04:00
chenyu
8080298739 s/tinytqdm/tqdm (#5103)
except in unit test where tqdm is imported
2024-06-22 14:18:26 -04:00
nimlgen
fb1bf48cfe io_uring for copies from disk (#5035)
* exp uring

* fixes and old version

* nv

* cleaner

* cmp vs aio

* fix

* no lib

* fix nv

* linter

* disk_speed_test now runs default

* fixes

* uring -> io_uring

* linter happy

* get_temp_buf comment added

* tiny nits

* put wait back

* test runs everywhere

* remove consts

* remove mmap consts

* do not require iouring to run test, they are generic
2024-06-21 11:36:51 +03:00
chenyu
4e5add4d01 move test_tqdm to test/unit/ (#5042) 2024-06-18 17:41:39 -04:00
Junjun Dong
c8cd6e725c Remove BinaryOps.SUB. Replace SUB by ADD and NEG in all tests. Regenerate dataset (#4977)
* feat: remove BinaryOps.SUB

* remove SUB in test_early_end_local

* regenerate dataset. remove SUB in test_linearizer_*

* reenable overflow tests

* simplify tensor.sub function by returning a+(-b)

* remove whitespaces

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-06-18 09:06:13 -04:00
chenyu
acaf9a490d RECIP(-0.0) should be -inf (#5024)
* RECIP(-0.0) should be -inf

added test_dtype_alu for PYTHON backend

* catch that

* fix those two
2024-06-17 22:26:58 -04:00
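The behavior named in acaf9a490d (#5024) follows IEEE 754, where the reciprocal of a signed zero is an infinity of the same sign. A minimal sketch in plain Python follows; the helper `recip` is hypothetical and is not the code from the commit. Zero has to be handled explicitly because Python raises `ZeroDivisionError` on float division by zero.

```python
import math

def recip(x: float) -> float:
    """Reciprocal with IEEE 754 results for signed zeros."""
    if x == 0:
        # -0.0 == 0 is True, so copysign is used to recover the sign of zero
        return math.copysign(math.inf, x)
    return 1.0 / x

print(recip(-0.0))
print(recip(0.0))
```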
George Hotz
1d6f1a15e1 add lt and ge uop methods [run_process_replay] (#4995)
* add lt and ge uop methods [run_process_replay]

* more correct (should still run process replay)
2024-06-16 09:33:53 -07:00
George Hotz
dac96f177e ignore indexing in the flopcounter (#4993) 2024-06-16 08:59:55 -07:00
chenyu
50bc14d186 re-enable test that loads torch pkl format (#4986) 2024-06-15 14:11:30 -04:00
wozeparrot
8209cd3c55 easier llama3 + fetch subdir (#4938) 2024-06-14 13:47:27 -07:00
chenyu
5eee974b2a construct Tensor from python list/tuple directly (#4947)
* construct Tensor from python list/tuple directly

no numpy. annoying that half memoryview is 3.12 feature...

* simpler, and test

* flat already

* simpler

* cute

* 10% faster

* 5%
2024-06-14 11:36:05 -04:00
Jhenner Tigreros
dc9e9e4363 Convert BinaryOps.DIV to UnaryOps.RECIP and BinaryOps.IDIV (#4887)
* Create UnaryOps.RECIP and BinaryOps.IDIV and changing uses of BinaryOps.DIV

* Delete unused import

* Add cstyle renderer

* Fix formatting text

* Fix test error due to bad implementation of renderer

* Add PTX support

* Add RECIP to LLVMIR

* Remove BinaryOps.DIV from symbolic test

* Change some test and fix C floor division

* Change references to DIV for the RECIP or IDIV

* Add mimic idiv for symbolic test

* Restore floor

* Mimic idiv

* cast to int

* Fix some test and renderer

* Remove DIV for render nodes

* Resolve issue with div

* Add TestRenderer

* Fix test

* fix error

* Fix PAD test

* Fix div implementation

* Remove DIV

* Add upcast to rshift, due to use of MUL and RECIP on DIV

* Fix linter

* Remove complete BinaryOps.DIV

* Fix lint

* Fix some test

* Revert mul modification

* Fix tests

* Fix CLANG for uops

* Revert IDIV function

* Minor fix

* modify pattern matching rule to support nan

* Fix UNSAFE_PADS_OPS to add UnaryOps.RECIP

* Remove const folding for IDIV and fix PTX

* Complete remove IDIV from extra

* Remove test_div from TestFloatUOps due to test on recip

* Fix linearizer

* fix

* Fix test_22

* Fix llvm

* Apply trunc function for llvmlit

* use floor instead of trunc

* Use correct type

* Generate new fuzz db

* Fix rshift, do not cast to float to support idiv

* Return upcast=false to rshift

* Add to unsafepad BinaryOps.IDIV

* Remove RECIP override for CUDA

* add atol / rtol for the test

* Remove cast to int on IDIV

* Regenerate sops

* delete sops.gz

* regenerate

* regenerate

* regenerate

* Reduce margins

* pass atol and rtol as parameters for _test_metrics

* regenerated dataset

* Regenerate

* Remove duplicated

* Revert changes on extra

* Remove changes extra and NOQA for test

* Remove E501

* Remove and change line

* Remove E501

* Fix atan2

* Revert import and E501

* Remove E501

* Add hrcp to half ops

* Remove 1 of hrcp

* Remove last DIV and add type check on uops for IDIV

* Fix new tests

* Fix tests and custom function

* Regenerate dataset

* Regenerate dataset

* Revert dataset

* Change generate dataset script

* Remove line

* Change IDIV, type checker validate if x,y and z are int

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-06-14 02:43:46 -07:00
George Hotz
9a3c1e4a17 fix mul div failure (#4928) 2024-06-12 13:58:46 +02:00
George Hotz
11a03cbbf5 don't use uops.add while constructing (#4913)
* don't use uops.add while constructing

* rebase

* bugfixes

* have to use BFS

* prove it's late

* simpler uop symbolic test (why we did this)

* use dict, not set
2024-06-12 13:31:34 +02:00
George Hotz
b9afb0d577 test uop as symbolic (#4870)
* start work

* more tests passing

* more tests passing

* more

* 34 failures

* expect the failures

* remove broken rule

* render is fine in just the test

* simplify and put in test
2024-06-09 12:15:11 +02:00
David Hou
cddce0e168 don't cast before view on shape changing bitcast (#4833)
* don't cast before view on shape changing bitcast

* make sure cast before view triggers
2024-06-04 16:04:52 -04:00