* trace buffer producer and consumers
* work
* generic colored util
* fix batched
* basic clicking works
* generic javascript that works for producer and consumers
* keep focused shape
* idle time
* timings for producer and consumers dedup
* from sd test
* tiny cleanups
* timeline
* work
* up to here
* assert
* list it
* work
* better viz names
* delete unused
* don't use opacity, it's multiplicative
* keep styles
* scrollbar coloring
* pyrender doesn't work here
beautiful_mnist r_64_16_32_36@lower all index dtypes
* nak works
* TestOps::test_add works
* testop has no crashes
* fix bool casts
* fix typo
* add disassemble
* RANGE and locals/regs
* simplify NAKCompiler
* disasm cleanup
* cleanup nir codegen
* almost all tests passing
* cleanup notes in extra/
* old notes
* only import nak if NIR=1
* fix new SPECIAL syntax
* fix local/shared memory
* more tests passing
* add DEFINE_VAR support
* llvmpipe kinda works
* diskcache
* some mypy stuff
* lvp passing test_ops.py
* fix imports
* actually fix imports
* remove 'stdout'
* fix llvm import
* fix mypy issues
* nicer errors
* simpler test_dtype skips
* test lvp in CI
* fix github action syntax
* fix more actions typos
* switch to mesa 25.1.0
* diskcache_put
* better generation for lvp nir_options
* b64encode shader blobs
* Revert diskcache changes
This reverts commits 930fa3de8a and 8428c694b3.
* general cleanup
* better error messages
* fix llvm import
* fix windows tests
* link with libm and libgcc_s
* fix some errors
* don't check for 'float4'
* NIR uses pointer arithmetic
* use tinymesa
* bump tinymesa
* bump tinymesa again
* update lvp nir_options
* print nir shader with DEBUG
* simplify LVPCompiler
* more tests
* "gated" STORE
* NAK is cacheable
* more tests
* all tests pass locally for NAK
* test autogen in CI
* autogen deps
* more deps
* fix uop_gc
* fix macos
* mypy
* save 2 lines
* save two more lines
* save 1 line
* save 4 lines
* save more lines
* Revert "save more lines"
This reverts commit dd3a720c5a.
* save more lines
* fix LVP on windows
* refactor
* reorganize some code
* refactor lib_gpu
* move LVP check
* out of order loads
* remove support.mesa
* bump tinymesa version
* simplify LVP jit
* macos
* macos ci
* shell: bash
* testing
* more testing
* compute brew prefix
* stupid typo
* actually fix
* lib
* stdout on macos
* inline gallivm_compile_module
* Revert "inline gallivm_compile_module"
This reverts commit b65983b151.
* elf macos
* semicolon
* inherit from CPULLVMCompiler
* ruff
* disasm test
* fix libm linking
* default is fine actually
* arm works
* add elf loader link test
* fix NAK beam
* pylint is too smart by half
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>
* work on shape property
* reshape causing issues
* more mops
* all mops
* need to cache it
* _shape is like _device
* mostly works
* shape is good
* const uses _shape
* fix tests
* size doesn't use st
* close
* test is broken
* one less st
* hack for 3 op assign
* oops, I didn't mean to change that
* support emulate in the NullDevice
* reproduced failure in emulation
* fix wmma
* test_schedule independent of RANGEIFY flag
* comment for expectedFailure + test_cast_padded_view
* test_cast_padded_const works
* don't use full_shape it's fine
* add todos for the rest
* remove restrictions on range ending in indexing
* early simplify
* Revert "early simplify"
This reverts commit 657d9972c2.
* disable const folding tests
* remove GroupOp.Meta and st_arg
* inline axis_arg
* only allow .buffer on reshapes (or the buffer)
* gate is the other way
* still want can_pad?
* use op_in_backward_slice_with_self
* .buffer is recursive
* lint
* pathlib there
* delete the old rangeify path and all the children stuff
* remove the on_stack stuff and any retries
* don't use the p word
* Revert "remove the on_stack stuff and any retries"
This reverts commit 49a2b328b9.