Commit Graph

286 Commits

Author SHA1 Message Date
George Hotz
fbf17f0031 intel benchmark matmul gets 60 TFLOPS? 2023-06-04 17:01:50 +00:00
Steven Anderson
657e642e3a Fixed test suite for Clip (#912)
* Fixed test suite for Clip

* fixed issue with clip when taking large negative numbers as min

* Remove typings
2023-06-04 09:01:01 -07:00
George Hotz
afd0be8a9c intel example 2023-06-04 06:43:09 +00:00
George Hotz
ed1963b899 Fast DiskTensor to other Tensor (#916)
* make disktensors fast

* loading

* loader for sd and llama
2023-06-03 12:25:41 -07:00
George Hotz
791530045d Refactor LoadOps (#910)
* test

* work

* upd test

* loadops

* cleanups

* real ones

* remove LazyNumpyArray

* fix assign test

* remove range

* np.require

* llama uses arange kernels

* no caching consts

* fix enet

* torch load support

* tests cleanup

* fix shufflenet

* fix image

* fix torch_load test
2023-06-03 09:40:43 -07:00
Steven Anderson
513aeb2f66 Fixed the whole ConstantOfShape test suite (#907) 2023-06-02 11:26:40 -07:00
Steven Anderson
301f7b54c6 ConstantOfShape ONNX test fixed. (#890)
* ConstantOfShape ONNX test fixed.

* removed redundant if statement

* value is optional and should default to a float32 tensor with value of 0

* fixed: default parameters are created at function definition, bad for mutable objects.
2023-06-02 07:34:25 -07:00
kposborne2
ae83e9844c add output_padding to transposed conv (#875) 2023-06-01 00:03:22 -07:00
Friedrich Carl Eichenroth
740304ef9d Small Onnx Parser Improvements (#885)
* wip

* rename onnx_version to onnx_model_versioN

* add type

* add types

* small cleanup

* revert some changes from before

* add todo

* dumb fix
2023-06-01 00:01:01 -07:00
Marcello Fuschi
3924aae8ed Fix ONNX dropout and unify the implementation (#857)
* Fix ONNX dropout and unify the implementation

* Use tensor rand method for dropout

* Change approach for RNG in ONNX Dropout

* Fix style

* Test legacy RNG seeding

* Remove the necessity for legacy RNG in Tensor class
2023-05-31 07:40:47 -07:00
skobsman
2e393f7ef2 InstanceNormalization ONNX test fixed. (#870) 2023-05-30 16:07:44 -07:00
Friedrich Carl Eichenroth
f91f28d9e2 fix a bunch of tests (#856) 2023-05-29 17:48:26 -07:00
zk-tarts
174c65b7d9 add onnx Binarizer op (#850)
Co-authored-by: zk-tarts <>
2023-05-29 13:15:50 -07:00
M4tthewDE
4408c25e9a Add Onnx op Shrink (#851)
* Add onnx Shrink operation

* Fix soft/hard shrink onnx test
2023-05-29 13:15:39 -07:00
Friedrich Carl Eichenroth
6f2b3755ca set axis default to 0 (#854) 2023-05-29 13:15:28 -07:00
Friedrich Carl Eichenroth
3b158f7a5f fix onnx versions greater or equal 10 (#853) 2023-05-29 13:04:06 -07:00
Diogo
1a5d72f812 Onnx ops And, Or, Xor, Not (#847)
* onnx and, or, xor, not

* added bool type to llvm and clang

* removed float conversion

* switched where op to use tensor func
2023-05-29 11:09:20 -07:00
SnakeOnex
844e6d0753 conv1d & conv3d onnx tests (#835)
* conv1d onnx

* [Work in progress] conv1d + enforcing full padding tuple length

* make ONNX padding reorder not hardcoded, works for 1D and 3D convs now

* conv2d interprets padding based on the input tensor dimensions
2023-05-29 10:16:45 -07:00
Marcello Fuschi
6d49925a26 Add max_pool2d dilation (#833) 2023-05-28 15:16:48 -07:00
cheeetoo
21d27d31a9 Fix a couple pad tests (#827)
* fix pad bug

* float type hint for value

* convert pads to list

* update Pad type signature

* Change | to Union since not supported in < python 3.10
2023-05-28 12:06:46 -07:00
Mattis Megevand
606b841d3f LR Schedulers (#755)
* lr schedulers + test

* lr scheduler test moved + integration test

* integration test for all lr scheduler

* lr scheduler test now deterministic

* changed optimizer + parameters for lr sched test
2023-05-27 07:47:49 -07:00
George Hotz
87fa5af70a ptx example 2023-05-26 19:28:51 -07:00
George Hotz
26014a0fa1 add convtranspose (#809)
* add convtranspose

* onnx convtranspose
2023-05-26 12:35:03 -07:00
wozeparrot
7351eb4b61 feat: put temporary file in the same directory as the destination file (#805) 2023-05-25 20:46:02 -07:00
Diogo
c19ef0fcce Add sin/cos/tan (#794)
* added sin/cos/tan

* fix lint

* added onnx ops support
2023-05-25 09:04:56 -07:00
George Hotz
0400315078 Revert "ops rdna"
This reverts commit 81a11d891d.
2023-05-21 13:02:18 -07:00
George Hotz
325a3bf2cf Revert "writing 2"
This reverts commit dddd6c42f0.
2023-05-21 13:02:17 -07:00
George Hotz
dddd6c42f0 writing 2 2023-05-21 12:52:36 -07:00
George Hotz
81a11d891d ops rdna 2023-05-21 11:45:38 -07:00
George Hotz
90fff82c8a Rdna (#776)
* assembler maybe

* custom asm

* rdna3 on quiet

* trigger crashes

* fixed notes

* non-fatal rdna2 crash

* Crash4

* improve rdna sniffer

* comments

* improve sniffer

* asm

* 131 TFLOPS RDNA3

* opt simple matmul

* todos
2023-05-16 05:33:57 -07:00
George Hotz
89b8b39d9c fix mypy 2023-05-13 21:25:36 -07:00
George Hotz
e0b2035023 fast imagenet eval, gets 76.14% across the set 2023-05-13 21:18:31 -07:00
George Hotz
46d419060b start on mlperf models 2023-05-10 16:30:49 -07:00
George Hotz
cb7c22beeb fix mypy 2023-05-06 19:18:54 +00:00
George Hotz
5190037cbc rocm: disassembler for shader 2023-05-06 19:07:52 +00:00
George Hotz
42256c0d9d rocm sniffer dumps code 2023-05-05 18:36:53 +00:00
George Hotz
f2a964f447 nocopy (#764) 2023-05-05 09:32:06 -07:00
George Hotz
3a2011ab2d rocm sniffer 2023-05-04 22:22:39 +00:00
George Hotz
a55c4f5000 better rocm build scripts 2023-05-04 09:14:05 +00:00
George Hotz
987b1aaf96 rocm build scripts 2023-05-04 08:45:23 +00:00
George Hotz
ed33a89d52 no werror in archprobe 2023-05-03 19:34:17 +00:00
George Hotz
7ecf4dff68 multi cl_queue (#762)
* multi cl_queue

* only platforms 1

* gpus first, then cpus

* put device on underlying buffer

* cl_queue array
2023-05-03 12:15:28 -07:00
George Hotz
3b933b0a2f rocm setup script 2023-05-03 16:01:17 +00:00
George Hotz
59d0d168cd FLOAT16 off works 2023-04-19 15:34:56 -07:00
George Hotz
3d15769a8f 50 TFLOPS cuda matmul 2023-04-19 14:38:24 -07:00
George Hotz
0b5a0b9ba4 winograd comment 2023-04-16 03:36:51 -07:00
George Hotz
8b777af571 metal_conv gets over 10.4 TFLOPS... 2023-04-15 03:31:22 -07:00
George Hotz
d66e682205 metal matmul from tcores branch 2023-04-14 23:29:29 -07:00
Sohaib
70b9072663 add Pad onnx operator and rework _padding (#740) 2023-04-06 17:07:36 +05:30
George Hotz
94e2c49c35 test_cacheline_size that works in both places 2023-03-30 06:47:20 +04:00