mirror of
https://github.com/tinygrad/tinygrad.git
synced 2026-02-11 15:15:13 -05:00
* basic tests
* cleanup
* pylint
* ruff
* use define acc as a proxy for rendered reductions
* use define acc as a proxy for rendered reductions
* recursive reduceop rendering via ast_parse
* linters + cleanup
* fixing late buf loading
* plus linters
* removing extra line
* linters
* does this break ci?
* added tests and if add end change
* typo in add_ends
* linters
* removing comments
* allow endifs to be inserted before the end of the graph
* find add ENDIF before next BARRIER
* removing tests with manual ENDIF + linters
* specifically the next barrier after the store of the local result
* Revert "specifically the next barrier after the store of the local result"
  This reverts commit b288a5c3ce.
* keeping up to date
* linters + merge changes
* cleaning up old bad decisions
* linters and opts
* merged linearizer tests
* fixing merge issues
* removing the big ugly uop test (functionality tested end-to-end by test_linearizer additions)
* small diff fixes
* updating linearizer to work without uops.add( ... cachable)
* linters
* comment in multireduce tests
* skipping tests without locals
* full tests
* linters
* load_cache[key] fix for multiple accs
* linters
* assert only one reduceop
* fix loop_scope test to actually cause an issue
* self.load_cache[key] key for DEFINE_ACC changed to use a string to make sure each acc is unique
* updated tests
* fixing merge
* removing debug prints
* complete merge fix
* linters
* diff cleanup
* adding tests in
* give each reduce its own local buffer
* gpu=1 changes
* store and load locals with upcasting
* modifying test?
* make multireduce_netsted_local_upcast test match single reduce shapes
* removing todo
* cleaning up the diff
* unroll test
* unroll and upcast tests
* fix gpu
* seq and self.load_cache[key] cleaning
* linters
* padto works
* merge fixes
* fixes
* add skips for amd
* linters + seq
* cleaning & more tests
* softmax tests
* linters
* [run_process_replay]
* add new tests back
  This reverts commit 19dec22e01.
* more hardcoded -1s
* fix ptx
* Fix name for loop in ptx
* cleaning up the diff
* cleaning up the uops diff
* nv ci is too slow

---------

Co-authored-by: qazal <qazal.software@gmail.com>
Co-authored-by: Szymon Ożóg <58388001+SzymonOzog@users.noreply.github.com>
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
106 KiB