tinygrad/extra
Yixiang Gao 13e872b53f add multigpu support for llama attention (#3064)
* add llama attention test for multigpu

* test fails

* kv cache trying to shrink on sharded axis

* mask None works for scaled dot product

* kv cache seems to be working but scaled dot product breaks

* scaled dot product works, but the last linear layer fails

* running into the reshape case where the result could be wrong for multigpu

* making sure it was the reshape

* adding contiguous doesn't solve it

* need to shard more carefully

* remove reshape test

* minor adjustment to scaled dot product attention test

* weights are sharded wrong

* continue fixing the new weight sharding

* clean up

* fix attention when start_pos is 0

* remove print

* add TODOs for the best multigpu interface
2024-01-11 16:31:02 -08:00
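
The squashed commits above trace the debugging of sharded llama attention. As a rough illustration of the kind of multigpu attention test this PR adds, here is a minimal sketch, assuming tinygrad's Tensor.shard and Tensor.scaled_dot_product_attention APIs; the device strings, shapes, and two-device setup are illustrative assumptions, not taken from the PR:

from tinygrad import Tensor, Device

# shard across two devices of the default backend (hypothetical setup)
devices = tuple(f"{Device.DEFAULT}:{i}" for i in range(2))
bsz, n_heads, seqlen, head_dim = 1, 8, 16, 64

# q/k/v sharded on the heads axis, so each device owns half the heads
q = Tensor.rand(bsz, n_heads, seqlen, head_dim).shard(devices, axis=1)
k = Tensor.rand(bsz, n_heads, seqlen, head_dim).shard(devices, axis=1)
v = Tensor.rand(bsz, n_heads, seqlen, head_dim).shard(devices, axis=1)

# with attn_mask=None, attention only reduces over head_dim and seqlen,
# so each device computes its own heads independently with no cross-device traffic
out = q.scaled_dot_product_attention(k, v, attn_mask=None)
print(out.shape)  # (1, 8, 16, 64)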