Files
tinygrad/test/test_linearizer.py
Timmy 720c700a8a Multireduce-Kernels: Linearizer Changes and Tests (#4259)
* basic tests

* cleanup

* pylint

* ruff

* use define acc as a proxy for rendered reductions

* use define acc as a proxy for rendered reductions

* recursive reduceop rendering via ast_parse

* linters + cleanup

* fixing late buf loading

* plus linters

* removing extra line

* linters

* does this break ci?

* added tests and if add end change

* typo in add_ends

* linters

* removing comments

* allow endifs to be inserted before the end of the graph

* find add ENDIF before next BARRIER

* removing tests with manual ENDIF + linters

* specifically the next barrier aftr the store of the local result

* Revert "specifically the next barrier aftr the store of the local result"

This reverts commit b288a5c3ce.

* keeping up to date

* linters + merge changes

* cleaning up old bad decisions

* linters and opts

* mrged linearizer tests

* fixing merge issues

* removing the big ugly uop test (functionality tested end-to-end by test_linearizer additions

* small diff fixes

* updating linearizer to work without uops.add( ... cachable)

* linters

* comment in multireduce tests

* skipping tests without locals

* full tests

* linters

* load_cache[key] fix for multiple accs

* linters

* assert only one reduceop

* fix loop_scope test to actually cause an issue

* self.load_cache[key] key for DEFINE_ACC changed to use a string to make sure each acc is unique

* updated tests

* fixing merge

* removing debug prints

* complete merge fix

* linters

* diff cleanup

* adding tests in

* give each reduce it's own local buffer

* gpu=1 changes

* store and load locals with upcasting

* modifying test?

* make multireduce_netsted_local_upcast test match single reduce shapes

* removing todo

* cleaning up the diff

* unroll test

* unroll and upcast tests

* fix gpu

* seq and self.load_cache[key] cleaning

* linters

* padto works

* merge fixes

* fixes

* add skips for amd

* linters + seq

* cleaning & more tests

* softmax tests

* linters

* [run_process_replay]

* add new tests back

This reverts commit 19dec22e01.

* more hardcoded -1s

* fix ptx

* Fix name for loop in ptx

* cleaning up the diff

* cleaning up the uops diff

* nv ci is too slow

---------

Co-authored-by: qazal <qazal.software@gmail.com>
Co-authored-by: Szymon Ożóg <58388001+SzymonOzog@users.noreply.github.com>
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2024-06-12 13:29:43 -04:00

106 KiB