The number after .so is the ABI version; it is always 1 for libgcc_s.
Most Linux systems select the default library version via symlinks, which
are simply followed to reach the actual ELF file. Conda instead uses linker
scripts, which ctypes doesn't follow (contents of libgcc_s.so below):
```
/* GNU ld script
Use the shared library, but some functions are only in
the static library. */
GROUP ( libgcc_s.so.1 -lgcc )
```
ctypes.util.find_library treats this text file as the actual ELF, and
ctypes.CDLL then tries to load it as a shared library. The result is:
```
  File "/home/me/src/tinygrad/tinygrad/device.py", line 223, in CPUProgram
    helper_handle = ctypes.CDLL(ctypes.util.find_library('System' if OSX else 'gcc_s'))
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/me/miniforge3/envs/tinygrad/lib/python3.12/ctypes/__init__.py", line 379, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: /home/me/miniforge3/envs/tinygrad/lib/libgcc_s.so: invalid ELF header
```
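For illustration only, here is a minimal sketch of how such a linker script
could be detected and resolved before handing the path to ctypes.CDLL. It is
not necessarily what this commit does; `is_elf` and `resolve_shared_library`
are hypothetical helper names, and the parsing only handles simple
GROUP/INPUT entries like the one shown above:

```python
import os
import re

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file

def is_elf(path):
    """Return True if the file at `path` starts with the ELF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == ELF_MAGIC

def resolve_shared_library(path):
    """Hypothetical helper: return `path` if it is an ELF file, otherwise treat
    it as a GNU ld script and return the first ELF file named in GROUP/INPUT."""
    if is_elf(path):
        return path
    with open(path, "r", errors="replace") as f:
        script = f.read()
    # Strip /* ... */ comments so words inside them are not parsed as entries.
    script = re.sub(r"/\*.*?\*/", "", script, flags=re.DOTALL)
    for match in re.finditer(r"\b(?:GROUP|INPUT)\s*\(([^)]*)\)", script):
        for token in match.group(1).split():
            if token.startswith("-l"):
                # -lgcc style entries need a real linker search; skip them here.
                continue
            if os.path.isabs(token):
                candidate = token
            else:
                # Assumption: relative names live next to the script itself.
                candidate = os.path.join(os.path.dirname(path), token)
            if os.path.isfile(candidate) and is_elf(candidate):
                return candidate
    raise OSError(f"{path}: not an ELF file and no loadable library found in linker script")
```

With a helper like this, the path returned for libgcc_s.so in a conda
environment would be the neighbouring libgcc_s.so.1, which ctypes.CDLL can
load.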
* LLVM JIT prereqs
This commit moves JIT loading, disassembly, and the CPUProgram logic from
`ops_clang.py` to `elf.py`, `helpers.py`, and `device.py` respectively.
I don't quite like `helpers.py` as the destination for capstone_flatdump,
but that is where cpu_objdump lives, so presumably that is where it is
supposed to go.
* Types
* only use BUFFER_VIEW in disk [pr]
* delete can_view
* BUFFER_VIEW op on DISK
* remove that allow_buffer_view=False
* notes
* bitcast is a low-level op too
* this passes on AMD and LLVM
* assert to prepare for grad uop [pr]
* fix test_nn
* fix most of test_tensor
* few more tests
* fix multi
* uniform gradient
* acc_dtype
* any for multi
* fix typing
* fix assert, CAST_BEFORE_VIEW is still the issue
* explicit test for CAST_BEFORE_VIEW
---------
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
* remove cast before view
* greener
* indexing
* delete view instant rule
* that passes too
* openpilot too
* ack
* base on cast_before_view
* add it as a rewrite rule
* VIEW(DEVICE) is also fine
* test_shard_memory depends on forced_realize removal
* put that back, will go soon
* UOp representations change once we don't instantly fold things
* do not duplicate tests
---------
Co-authored-by: qazal <qazal.software@gmail.com>
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
* init turing tc
* reorder tc
* hotfix: remove some spaces
* revert var name to x
* consistent order of factors
* revert order of terms to match old stuff
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
* is 67% considered fixed?
* move test up
* share function
* add qgemm too
* make sure qgemm comes out as int
* actually that note is not right
* remove qgemm (I did it wrong) and add it later lol.