George Hotz
e15754db28
remove (some) kernelize from llama and test schedule speed ( #10939 )
...
* remove kernelize from llama
* 405B
* space
2025-06-23 15:07:31 -07:00
chenyu
3699d1d3ba
hotfix llama3 temperature is float ( #10938 )
2025-06-23 15:20:56 -04:00
uuuvn
4e2c9e36c7
Remote multihost (p2p transfer) ( #10601 )
2025-06-23 11:47:29 -07:00
chenyu
42b1c9625b
skip test TestKiTS19Dataset::test_training_set ( #10936 )
...
flaky
2025-06-23 14:27:24 -04:00
patrini32
9e9fd44987
refactor test/external/external_llama_eval.py ( #10567 )
...
Co-authored-by: wozeparrot <wozeparrot@gmail.com>
2025-06-23 10:43:20 -07:00
chenyu
785b4ea8ac
optim flatten().shape[0] is numel ( #10935 )
2025-06-23 13:11:19 -04:00
qazal
ac39f27ae6
viz: non blocking UOp tracing ( #10913 )
...
* viz: non blocking UOp tracing
* u.arg
* no if Ops.KERNEL
* drop replace
* switch to weakref.WeakKeyDictionary
* back
* remove ram usage skips, viz works here
* cache on reconstruct
2025-06-23 19:59:28 +03:00
Ignacio Sica
b8d09a1dae
tc with group/grouptop ( #10903 )
2025-06-23 09:58:41 -07:00
qazal
9944c2c02d
viz: show time taken on hover ( #10934 )
2025-06-23 19:00:40 +03:00
George Hotz
1e99a7f1c9
hotfix: don't viz the indexing rewrites
2025-06-23 08:20:26 -07:00
chenyu
f9b59924f1
OPTIM_DTYPE to specify dtype for optim params ( #10925 )
...
one more flag
2025-06-23 10:32:03 -04:00
qazal
7820aeca8e
update codegen process replay to use get_program [pr] ( #10921 )
...
* update codegen process replay to get_program [pr]
* precommit
* try str replace
* +to_function_name
* fixup tc
* local2.sh
* fix openpilot NOLOCALS
* new local.sh
* correct merge
* beam cache
* back
* revert beam thing
* adding opts_override and name_override makes output of get_program reproducible
* min diff
2025-06-23 17:31:41 +03:00
nimlgen
eceb7a00d2
nv: rename iface mem functions ( #10931 )
2025-06-23 16:34:51 +03:00
qazal
4e864bd304
fix: getenv("NOLOCALS")/NOLOCALS context var ( #10927 )
...
OptOps shouldn't rely on os.environ.
2025-06-23 11:23:59 +03:00
alpharush
22f9696522
Fix/hcqfuzz harness bug ( #10923 )
...
* update command so extra module is found
* fix empty range in randrange errors
* lint
2025-06-23 11:22:30 +03:00
qazal
f037f85532
s/getenv("TC")/USE_TC context var ( #10922 )
2025-06-23 00:39:45 +03:00
qazal
9201224e0b
viz: remove Kernel check [pr] ( #10920 )
...
* viz: remove Kernel check [pr]
* TestVizIntegration
* test/unit allows opening of devices
* kernel -> Kernel
2025-06-22 20:47:54 +03:00
nimlgen
3ccdb2356b
system: factor out PCIIfaceBase ( #10917 )
...
* system: factor out PCIIfaceBase
* linter
* typing
2025-06-22 20:03:14 +03:00
George Hotz
b09c47366f
opt transforms the ast into an optimized ast ( #10900 )
...
* opt transforms the ast into an optimized ast
* fix get_kernel order and to_function_name
* function_name property
* update docs
* copy from kernel.py
* improve docs
* ci didn't trigger?
2025-06-22 09:41:26 -07:00
qazal
ffddf165f8
viz: color by kernel names in profiler ( #10919 )
...
* viz: color by kernel names in profiler
* ellipsis stays in bounds
2025-06-22 18:07:52 +03:00
nimlgen
36536ef6f0
nv: minor changes from nvpci ( #10918 )
2025-06-22 18:04:39 +03:00
geohotstan
4ab7d792cc
ONNX improve dtype fallback ( #10800 )
...
* fix
* add early verbose demo test
* is this how to write tests :s
* is definition drift even a thing? gemini says it is
* clean up
* better
* even better
* try add to CI
* doesn't work quite yet
* much more work to be done
* whoops
* partition the test heh
* skipif
* some nits for better names
* add webgpu test for onnxrunner
* fix reference links
* flush for now
2025-06-21 19:29:45 -04:00
chenyu
0480139def
log_perplexity metrics ( #10912 )
2025-06-21 10:44:47 -04:00
nimlgen
0e7bd9fd03
factor out generic MemoryManager ( #10910 )
...
* allocator -> memory
* just move it out
* mm is abstracted
* need entry abstraction
* fix
* mypy
2025-06-21 16:18:33 +03:00
qazal
c7ec913210
viz: cleanup unit tests ( #10909 )
...
* cleanup test_viz
* tree view
2025-06-21 12:35:09 +03:00
chenyu
1373071f19
simplify logcumsumexp ( #10908 )
...
clarify and remove some flatten and squeeze/unsqueeze
2025-06-20 22:56:42 -04:00
George Hotz
fa52bdb50f
applied_opts is in the optimized ast ( #10906 )
2025-06-20 18:56:23 -07:00
chenyu
2d9c61e39e
test more dims in test_logsumexp and test_logcumsumexp ( #10907 )
...
refactoring squeeze and unsqueeze is easy to get wrong
2025-06-20 21:42:18 -04:00
Nino Risteski
3771cc0f77
fix test logcumsumexp broken devectorize=0 ( #10880 )
...
* fix test logcumsumexp numerical
* lint
* Use dtypes.min instead of -1e4
2025-06-20 20:54:50 -04:00
George Hotz
7636d2cdc5
flip order of get_program args ( #10905 )
2025-06-20 17:23:23 -07:00
George Hotz
1ce63f8d04
move functions to view and update docs [pr] ( #10904 )
...
* move functions to view and update docs [pr]
* move quantize
2025-06-20 16:47:58 -07:00
George Hotz
b41e0563a3
move stuff to kernelize folder ( #10902 )
...
* move stuff to kernelize folder
* oops, forgot that
2025-06-20 16:10:20 -07:00
George Hotz
d399a4587d
move mem estimate to ProgramSpec [pr] ( #10901 )
2025-06-20 15:54:28 -07:00
George Hotz
92678e59ee
move kernel to opt ( #10899 )
2025-06-20 15:22:28 -07:00
nimlgen
bb0299b9e5
system: shared pci logic ( #10894 )
...
* moveout pci logic
* fixes
* oops
* types
* more type
* one style
* this is imp
2025-06-21 00:09:49 +03:00
nimlgen
c83fdc50d1
nv: driver iface ( #10895 )
...
* nv: driver iface
* fixes
* ops
* not used anymore
* fix mypy
* too long
* fix
* fixed
* mypy
* ugh, it's misc
* rename to NVK
2025-06-20 22:36:08 +03:00
George Hotz
fc9f883870
if upat returns self, it's none ( #10898 )
...
* if upat returns self, it's none
* fix pm tests
2025-06-20 12:11:19 -07:00
qazal
4f179b9ddb
viz: gate launch behind a ContextVar [pr] ( #10892 )
2025-06-20 17:30:32 +03:00
chenyu
3f29c7edda
minor onnx dropout cleanup ( #10891 )
...
we should consider removing numpy random and test it similar to test_randomness, unless how seed works is part of spec?
2025-06-20 10:18:34 -04:00
simone-pietro
e94ac6e20c
Cast ptr to int in test_from_mv_to_mv ( #10876 )
...
* Cast ptr to int in test_from_mv_to_mv
* Add type hints for from_mv
2025-06-20 14:52:34 +03:00
qazal
000eb30f04
viz: remove prev profiler file ( #10888 )
...
The new profiler is integrated in the main VIZ tab.
Will also delete perfetto.html after matching [final features](https://github.com/tinygrad/tinygrad/pull/10763#issuecomment-2980543715) soon.
2025-06-19 23:05:46 +03:00
chenyu
62a540066e
remove DEBUG=2 in mi300x bert setup ( #10886 )
...
seems fine now, not sure what the issue was
2025-06-19 13:28:53 -04:00
Nino Risteski
5a56710ff4
small fix replacing download_file with fetch ( #10877 )
...
* imported a missing os and replaced download_file with fetch from tg helpers
* use fetch directly
* Remove if not os.path.isfile
2025-06-19 12:12:09 -04:00
chenyu
8d721a4ead
add 405B params to llama3.py ( #10884 )
...
tested with `python examples/llama3.py --model /raid/weights/llama31_405b/ --size 405B --shard 8 --benchmark` on tinyamd2
2025-06-19 11:45:37 -04:00
chenyu
a3dae51085
lower test_gemm_8192 on red ( #10883 )
2025-06-19 10:01:25 -04:00
simone-pietro
36f01411a2
Pass list to block_reorder in test_loads ( #10881 )
2025-06-19 09:49:45 -04:00
chenyu
f377cc19cd
use AM for bert ( #10882 )
...
have trained 3 runs and all seem fine
2025-06-19 09:48:54 -04:00
borgwang
06ea74bf2c
fix-typos ( #10879 )
2025-06-19 09:13:31 -04:00
qazal
ac891b78f8
skip UOp del when python is shutting down [pr] ( #10847 )
2025-06-19 15:31:40 +03:00
simone-pietro
58252e3c49
Change type hint for init_c_struct_t and to_struct [pr] ( #10878 )
...
* Change type hint for init_c_struct_t
* Change type hint for to_struct
2025-06-19 13:22:44 +03:00