George Hotz
b086584d64
add vcount to PtrDtype ( #7388 )
2024-10-30 10:43:54 +08:00
uuuvn
06a8700bfa
Replace sqrtl (long double) with sqrt (double) for double ( #7366 )
2024-10-30 10:20:41 +08:00
gonutz
e7cbc6dc23
Fix ValueError in Yolo 8 example ( #7387 )
...
Calling
python3 examples/yolov8.py ./test/models/efficientnet/Chicken.jpg
used to result in this error
ValueError: Calling nonzero on 0d arrays is not allowed.
Using np.atleast_1d makes sure we avoid a zero-dimension array.
Co-authored-by: gonutz <gonutz@fake.mail>
2024-10-30 10:18:39 +08:00
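
A minimal sketch of the np.atleast_1d fix described in the commit above. The helper name nonzero_indices and the sample masks are hypothetical, chosen only for illustration; the actual change in examples/yolov8.py may look different.

```python
import numpy as np

def nonzero_indices(mask):
    # np.atleast_1d promotes a 0-d array (a bare scalar) to shape (1,),
    # so np.nonzero always receives an array with at least one dimension.
    return np.nonzero(np.atleast_1d(mask))[0]

scalar_mask = np.array(True)                 # 0-d array, shape ()
vector_mask = np.array([False, True, True])  # ordinary 1-d array

print(nonzero_indices(scalar_mask))
print(nonzero_indices(vector_mask))
```

Arrays that already have one or more dimensions pass through np.atleast_1d unchanged, so the wrapper only affects the scalar case.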
chenyu
f389e1a8a0
test more special values for sin/cos/tan [pr] ( #7386 )
2024-10-29 21:13:37 -04:00
chenyu
33acbaeb24
reuse polyN in trig_poly float64 ( #7385 )
...
Similar speed, fewer ALU ops (151 vs. 154 per sine), and simpler. The power-of-2 handling should probably be done in polyN if needed.
2024-10-29 20:45:56 -04:00
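
For readers unfamiliar with the idea, a polynomial helper of this kind is typically a Horner-style evaluation. The sketch below is an assumption about the general technique, not the code from the commit: poly_n and the Taylor coefficients are illustrative names and values.

```python
import functools
import math

def poly_n(x, coeffs):
    # Horner's method: coefficients are ordered from the highest degree
    # down to the constant term, one multiply and one add per coefficient.
    return functools.reduce(lambda acc, c: acc * x + c, coeffs, 0.0)

# Degree-7 Taylor coefficients of sin(x), highest power first.
# Illustrative only; a production kernel would use tuned coefficients.
SIN_TAYLOR = [-1 / 5040, 0.0, 1 / 120, 0.0, -1 / 6, 0.0, 1.0, 0.0]

approx = poly_n(0.5, SIN_TAYLOR)
print(approx, math.sin(0.5))
```

Sharing one such helper between code paths is one way to reduce the operation count, since every caller gets the same multiply-add chain.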
chenyu
6bf38c35e5
clean up transcendental frexp [pr] ( #7384 )
...
also added some unit tests for frexp
2024-10-29 18:51:37 -04:00
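
As a reference point for what frexp computes, here is a small check using Python's standard math.frexp. It is not the unit test added in the commit; the helper name split_float is hypothetical.

```python
import math

def split_float(value):
    # math.frexp returns (mantissa, exponent) such that
    # value == mantissa * 2**exponent and 0.5 <= |mantissa| < 1 for nonzero input.
    mantissa, exponent = math.frexp(value)
    return mantissa, exponent

for value in (8.0, 0.75, -3.0):
    mantissa, exponent = split_float(value)
    # Reconstructing the value from its parts should be exact for these inputs.
    print(value, mantissa, exponent, mantissa * 2 ** exponent == value)
```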
chenyu
99b82f5708
minor cleanup payne_hanek_reduction [pr] ( #7383 )
2024-10-29 17:59:18 -04:00
chenyu
f6abde95fa
clean up Tensor._reduce ( #7382 )
...
use make_tuple and self.ndim
2024-10-29 17:23:57 -04:00
nimlgen
4ed2c40d48
qcom a bit cleaner ( #7380 )
2024-10-29 23:50:28 +03:00
chenyu
07ad6d20ed
simpler commutative flipping condition ( #7377 )
...
`x.src[1].tuplize < x.src[0].tuplize` implies `x.src[0] is not x.src[1]`
also renamed cc -> op
2024-10-29 13:51:24 -04:00
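
The implication in the commit above follows from strict less-than being irreflexive: a value is never less than itself. The sketch below uses a hypothetical Node class whose key field stands in for tuplize; it is not tinygrad code.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Node:
    # Hypothetical stand-in for a UOp; `key` plays the role of `tuplize`.
    key: Tuple[int, ...]

def should_flip(a: Node, b: Node) -> bool:
    # Strict `<` on tuples is False when both sides are the same object,
    # so no separate identity check is needed.
    return b.key < a.key

def should_flip_with_guard(a: Node, b: Node) -> bool:
    # The longer form with the redundant `is not` guard.
    return a is not b and b.key < a.key

nodes = [Node((1, 2)), Node((1, 3)), Node((0, 9))]
print(all(should_flip(a, b) == should_flip_with_guard(a, b)
          for a in nodes for b in nodes))
```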
chenyu
d3c192b056
Device method cleanup [pr] ( #7375 )
2024-10-29 12:49:47 -04:00
chenyu
f8a623b386
fix typing in Conv2d ( #7374 )
...
* fix typing in Conv2d
`self.padding: Union[int, List[int]]` was wrong
* fix that
2024-10-29 11:27:46 -04:00
qazal
7bd79f4922
pass viz render errors ( #7369 )
...
* pass viz render errors
* pcall
2024-10-29 22:48:27 +08:00
qazal
51c0c8d27e
cachable small graph rewrite ( #7371 )
2024-10-29 22:28:13 +08:00
chenyu
9b81931a36
make_pair -> make_tuple [pr] ( #7372 )
...
It's used more often as a generic tuple. Also removed the default of 2.
2024-10-29 10:27:39 -04:00
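
A minimal sketch of what a helper like make_tuple might look like once the count is a required argument. The signature and behavior here are assumptions inferred from the commit title, not copied from the repository.

```python
from typing import Sequence, Tuple, Union

def make_tuple(x: Union[int, Sequence[int]], cnt: int) -> Tuple[int, ...]:
    # A single int is repeated cnt times; a sequence is converted as-is.
    # Note: this sketch does not check that a sequence has length cnt.
    return (x,) * cnt if isinstance(x, int) else tuple(x)

print(make_tuple(3, 2))
print(make_tuple([1, 2, 3], 3))
```

Requiring cnt at every call site makes the intended length explicit, which fits a helper used for tuples of arbitrary size rather than only pairs.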
qazal
d803a9c7c8
global metadata try 2 ( #7367 )
2024-10-29 20:21:00 +08:00
George Hotz
2cfc7b6695
Index everywhere 2 ( #7363 )
...
* indexing everywhere [pr]
* fix tests
2024-10-29 19:29:40 +08:00
qazal
7149eabb34
assert set equality in TestTensorMetadata [pr] ( #7364 )
2024-10-29 19:29:29 +08:00
qazal
0ebdb136e8
revert metadata with graph_rewrite ( #7353 ) ( #7362 )
...
This reverts commit 540e4179e7.
2024-10-29 19:16:31 +08:00
qazal
f2044cfb22
hotfix: if getenv("RUN_PROCESS_REPLAY") ( #7361 )
2024-10-29 18:51:29 +08:00
George Hotz
0af1212164
use assertEqual with new style uops [pr] ( #7360 )
2024-10-29 18:43:21 +08:00
George Hotz
0beb2d8f84
ptx indexing ( #7359 )
...
* ptx indexing
* shorter
* fix load/store
2024-10-29 18:29:44 +08:00
George Hotz
572499c71a
add indexing to ops_python ( #7358 )
...
* add indexing to ops_python
* fix image
2024-10-29 18:11:03 +08:00
qazal
540e4179e7
global UOp to Metadata mapping + inverse DEBUG=2 metadata order [pr] ( #7353 )
...
* add ctx.buf_metadata [pr]
* revert metadata insertion order
* lint rename
2024-10-29 17:12:00 +08:00
George Hotz
2fdfcffe4c
improve ci speed [pr] ( #7357 )
2024-10-29 17:00:35 +08:00
qazal
8fab7b21df
everything is load ( #7355 )
...
* everything is load
* rename to ops
2024-10-29 16:47:33 +08:00
George Hotz
b647fa7514
rename MathTraits to maximum [pr] ( #7356 )
2024-10-29 16:43:04 +08:00
George Hotz
2bf55d8eda
make ops more like tensor [pr] ( #7352 )
...
* make ops more like tensor [pr]
* tensor is simple math trait
* no shifts
2024-10-29 16:23:41 +08:00
George Hotz
3989bd2682
idiv + reciprocal [pr] ( #7354 )
...
* idiv + reciprocal
* remove upcast from div
* fix docs
2024-10-29 15:54:19 +08:00
Bhavya Gada
3419ae282d
VIZ UI improvement: generic function for vscode opener ( #7338 )
...
* generic function for vscode opener
* eslint
* shorter
---------
Co-authored-by: qazal <qazal.software@gmail.com>
2024-10-29 15:02:04 +08:00
qazal
c03e1693fc
shorter gate folding [pr] ( #7350 )
2024-10-29 14:49:32 +08:00
George Hotz
3e8225299c
ext gate indexing ( #7349 )
...
* ext gate indexing
* copy paste better
2024-10-29 14:46:10 +08:00
Bhavya Gada
13ea4979d5
VIZ UI improvement: autoscroll kernel list when using arrow buttons ( #7344 )
2024-10-29 14:40:42 +08:00
George Hotz
d9d4dd6756
faster ci [pr] ( #7348 )
2024-10-29 14:01:44 +08:00
George Hotz
a5e0f59e41
move autogen to different CI runner [pr] ( #7346 )
...
* move autogen to different CI runner [pr]
* balance a bit
* readme back there
* compile enet in autogen
2024-10-29 13:35:22 +08:00
George Hotz
4cb236a495
index in cstyle ( #7328 )
...
* index only in cstyle
* fix prefix dtypes
* fix tests
* global indexing
* Revert "global indexing"
This reverts commit 4d507e8abb.
* fix image
* fix image
* ptx tests
* fix CUDA dtype rendering
2024-10-29 13:06:26 +08:00
George Hotz
f55c3dcff8
hotfix: bump ocelot
2024-10-29 12:46:24 +08:00
George Hotz
4fe1945df6
llvm if load ( #7345 )
...
* llvm if load
* unneeded line
* local llvm CI
2024-10-29 11:33:22 +08:00
chenyu
8625dd4eea
minor changes reading ops.py ( #7343 )
2024-10-28 19:04:12 -04:00
chenyu
6021bf87f4
unify T = TypeVar("T") ( #7342 )
2024-10-28 18:43:44 -04:00
chenyu
293adc141a
clean up get_shape [pr] ( #7341 )
...
* clean up get_shape [pr]
aapi is literal false
* more
2024-10-28 18:25:37 -04:00
chenyu
c398f2467c
test uop mul min/max do not have nan in 0*inf ( #7340 )
2024-10-28 17:52:01 -04:00
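
Some background on why 0*inf matters for min/max bounds, shown with plain Python floats. This only illustrates IEEE 754 behavior; how the test in the commit above exercises it is not shown in the log.

```python
import math

# Under IEEE 754, zero times infinity is NaN, not zero.
product = 0.0 * math.inf
print(math.isnan(product))

# Every ordering comparison with NaN is False, so Python's built-in
# min() gives a result that depends on argument order.
print(min(product, 1.0))
print(min(1.0, product))
```

A bound computed from corner products can therefore silently become NaN, or depend on evaluation order, if one corner is 0*inf.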
chenyu
0843734927
clean up nan handling in transcendental ( #7332 )
...
* clean up nan handling in transcendental
* skip remu crash
2024-10-28 16:21:49 -04:00
Sieds Lykles
75dcd98e79
Fix calculation of vmin and vmax in multiplication when one src is negative and the other src has negative min and positive max ( #7333 )
...
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-10-28 16:01:46 -04:00
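
The case named in the commit title above can be checked with standard interval arithmetic: the bounds of a product are the minimum and maximum of the four corner products. The function mul_bounds is a hypothetical illustration with integer bounds, not the fixed tinygrad code.

```python
def mul_bounds(a_min, a_max, b_min, b_max):
    # The extremes of a*b over two intervals always occur at a corner,
    # so evaluating all four corner products is sufficient.
    corners = (a_min * b_min, a_min * b_max, a_max * b_min, a_max * b_max)
    return min(corners), max(corners)

# One source is entirely negative, the other spans zero.
print(mul_bounds(-3, -2, -4, 5))
```

Shortcuts that look at only two corners are correct when both intervals have a fixed sign, but they break in the mixed-sign case, which is why checking all four corners is the safe default.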
chenyu
603fcc96f2
limit UOps.ALU min/max to non-float only ( #7336 )
...
does this impact anything? some inf is incorrect now
2024-10-28 15:34:19 -04:00
ignaciosica
32fa297e6c
cleaner nan rendering ( #7337 )
2024-10-28 14:36:36 -04:00
qazal
00362a117c
scheduler bfs renames [pr] ( #7335 )
2024-10-29 00:24:23 +08:00
qazal
d8820644e0
split preschedule from ast rewrite [pr] ( #7334 )
2024-10-28 17:45:09 +02:00
chenyu
6b0e8cb04f
remove float_to_bits in transcendental [pr] ( #7331 )
...
it's just bitcast, and removed the weird bits_to_float indirection
2024-10-28 10:20:19 -04:00
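
To illustrate the point that converting a float to its bits is just a bitcast, here is a standalone sketch using Python's struct module. The function names mirror those mentioned in the commit, but the bodies are an assumption for illustration and do not use tinygrad.

```python
import struct

def float_to_bits(x: float) -> int:
    # Reinterpret the 4 bytes of a float32 as an unsigned 32-bit integer.
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_float(bits: int) -> float:
    # The inverse reinterpretation: same bytes, read back as a float32.
    return struct.unpack("<f", struct.pack("<I", bits))[0]

print(hex(float_to_bits(1.0)))
print(bits_to_float(0x3F800000))
```

Both directions are the same operation with the source and destination types swapped, which is why a single bitcast primitive can replace a pair of helpers.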
qazal
b9b28e6883
viz stuff [pr] ( #7330 )
...
* viz stuff [pr]
* button
2024-10-28 21:46:18 +08:00