George Hotz
ed194a1d3b
zero fold ( #1748 )
* add constant fold
* err, it's just zero folding
* self store fold + caching
* prints and more folds
* simpler winograd kernels
* remove childless uops
2023-09-03 13:48:11 -07:00
George Hotz
e17b1af160
UnaryOps.NEG ( #1749 )
2023-09-03 12:44:26 -07:00
George Hotz
e910e0e62c
folding mul by 0 ( #1743 )
* why doesn't this work
* zero mlop
* explicit fold in winograd
2023-09-03 09:04:12 -07:00
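A conceptual sketch of what the mul-by-0 fold in #1743/#1748 buys at the user level (the fold itself lives in the lazy graph):

```python
# multiplying by zero collapses to a constant zero at graph-build time,
# so no elementwise multiply kernel needs to run.
from tinygrad.tensor import Tensor

x = Tensor.randn(4, 4)
z = x * 0
print(z.numpy())  # all zeros
```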
David Hou
3151d91f6e
3x3 winograd convs ( #1675 )
* winograd
* simplify local groups code
* comment
* respects self.opts.has_local
* always simplify ones
* make mypy happy
* move reshape, WINO flag
* wino flag, simple forward backward test for wino
* extra wino test
* merge oops
* comments
* axis_needs_valid -> axis_is_masked
* don't delete needs_valid (it's unused though)
* make linter happy
* make linter happy
* smaller test
* change number
* make wino tests very small
2023-09-03 07:29:43 -07:00
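A minimal sketch of the shape this path targets; per the "WINO flag" bullet above, WINO=1 in the environment is assumed to enable it:

```python
# 3x3 convolution, the case the winograd kernels in #1675 accelerate.
# run with WINO=1 to take the winograd path (assumption from the "WINO flag" bullet).
from tinygrad.tensor import Tensor

x = Tensor.randn(1, 16, 32, 32)   # NCHW input
w = Tensor.randn(32, 16, 3, 3)    # 3x3 kernels
out = x.conv2d(w, padding=1)
print(out.shape)                  # (1, 32, 32, 32)
```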
geohotstan
e36148b1ce
Make __getitem__ TINYer ( #1661 )
2023-09-02 23:01:01 -04:00
George Hotz
458eb89463
minor changes from prerender ( #1734 )
2023-09-01 10:04:47 -07:00
Roelof van Dijk
62536d6000
perf: use enumerate where possible ( #1692 )
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-30 10:41:51 -07:00
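The pattern behind this perf refactor, shown generically (not tinygrad-specific code):

```python
xs = ["a", "b", "c"]
for i in range(len(xs)):    # before: re-index the list on every iteration
    print(i, xs[i])
for i, x in enumerate(xs):  # after: enumerate yields index and item together
    print(i, x)
```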
Roelof van Dijk
50f669e43b
[ready] perf: simpler Tensor init ( #1679 )
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-27 22:18:03 -04:00
chenyu
66fbf4800b
fix symbolic_ops tests with Tensor.training=True ( #1686 )
2023-08-26 23:19:56 -04:00
Jordan Wright
25be7f745d
Tensor.uniform with dtype=int bug fix ( #1593 )
2023-08-26 01:59:53 -04:00
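A sketch of the case #1593 fixes; the tinygrad.helpers import path for dtypes and the low/high keywords reflect this point in the history and should be treated as assumptions:

```python
# integer-dtype uniform sampling, the case the bugfix covers
from tinygrad.tensor import Tensor
from tinygrad.helpers import dtypes  # dtypes lived in helpers at this time

t = Tensor.uniform(2, 3, low=0, high=10, dtype=dtypes.int32)
print(t.numpy())  # integer samples in [0, 10)
```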
Roelof van Dijk
02e64da678
refactor: tuples can be concatenated with + ( #1671 )
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-25 12:37:13 -04:00
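The refactor's pattern, generically:

```python
shape, extra = (2, 3), (4,)
print(shape + extra)  # (2, 3, 4); no tuple(list(shape) + list(extra)) detour
```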
George Hotz
1b8c40234f
Uast start ( #1650 )
* work
* more tests
* more tests 2
* don't break it
2023-08-23 12:00:06 -07:00
geohotstan
484708da87
#1615 fix ( #1616 )
2023-08-23 14:51:05 -04:00
George Hotz
a6d842af7a
move device to ops ( #1646 )
* move device to ops
* mlops types
* 2 lines
2023-08-23 08:30:17 -07:00
nimlgen
a65ae1198b
don't replace div->mul for non-floats ( #1644 )
2023-08-23 07:34:31 -07:00
George Hotz
da694d4241
move that image import
2023-08-22 21:30:55 -07:00
George Hotz
41e83be3dd
simple where broadcast ( #1643 )
2023-08-22 21:24:49 -07:00
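A sketch of the broadcast this enables; a scalar branch broadcasts against the condition:

```python
from tinygrad.tensor import Tensor

cond = Tensor([[1, 0], [0, 1]])
out = cond.where(Tensor([[10, 20], [30, 40]]), -1)  # scalar -1 broadcasts
print(out.numpy())  # [[10 -1] [-1 40]]
```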
George Hotz
c831218139
Optional: Reduce line count and simplify the LazyBuffer interface ( #1642 )
* less lines in lazybuffer, def e
* custom function
* cast
* reorder functions
* lb type
2023-08-22 21:01:10 -07:00
c143
c9c40bb16f
Import whole math module in tensor.py ( #1628 )
2023-08-22 17:07:46 -04:00
Roelof van Dijk
6fcfa50b35
[ready] perf: no noop cast just to make mypy happy ( #1626 )
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-22 17:07:22 -04:00
George Hotz
de1fcc418f
no more toCPU path ( #1624 )
2023-08-22 11:07:26 -07:00
Umut Zengin
1e93fd5449
Readability for unreadable functions ( #1610 )
* cleaned
* typing
* typing
* if format
* if format
* mypy
* update argmax
* argmax more readable
* More stable def pad
* lint
2023-08-22 07:09:08 -07:00
Roelof van Dijk
b02f77b354
perf: faster broadcasted ( #1601 )
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-21 14:21:46 -07:00
Yixiang Gao
4d54afb6df
sparse cat cross entropy ( #1597 )
* add sparse cat cross entropy
* minor fix
* add log_softmax into loss function
* add test
* update docs
* fix training loss
* add device
2023-08-21 14:14:54 -07:00
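A usage sketch of the hlop, assuming the method name sparse_categorical_crossentropy for "sparse cat cross entropy":

```python
from tinygrad.tensor import Tensor

logits = Tensor.randn(4, 10)   # batch of 4, 10 classes
labels = Tensor([1, 0, 7, 3])  # integer class ids, not one-hot
loss = logits.sparse_categorical_crossentropy(labels)  # log_softmax applied inside, per the bullets
print(loss.numpy())
```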
Roelof van Dijk
109100656f
refactor: no len if it is not needed ( #1598 )
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-21 14:06:32 -07:00
George Hotz
2e60920317
Revert "sparse cat cross entropy ( #1591 )" ( #1596 )
This reverts commit f0ee850e98.
2023-08-21 10:04:26 -07:00
Yixiang Gao
f0ee850e98
sparse cat cross entropy ( #1591 )
* add sparse cat cross entropy
* minor fix
* add log_softmax into loss function
* add test
* update docs
2023-08-21 09:56:41 -07:00
Umut Zengin
35bf21276f
Argmax/Argmin Feature ( #1576 )
* implemented argmax and argmin
* lint
* lint
* match torch behaviour
* format
* removed flip
2023-08-20 18:46:46 -07:00
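Usage sketch; per the "match torch behaviour" bullet, the no-axis form returns a flat index:

```python
from tinygrad.tensor import Tensor

t = Tensor([[1, 5, 2], [7, 0, 3]])
print(t.argmax().numpy())        # 3, flat index of the max (torch-style)
print(t.argmax(axis=1).numpy())  # [1 0], per-row indices
print(t.argmin(axis=0).numpy())  # [0 1 0], per-column indices
```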
Umut Zengin
3fc7e984f0
__getitem__ refactoring ( #1586 )
* try
* try
* form
* form
* form
* form
* lint
* small change
* preserve old
* revert to explicit reshape
2023-08-20 18:42:30 -07:00
George Hotz
d627349af0
teeny changes ( #1589 )
* teeny changes
* import order
2023-08-20 13:38:38 -07:00
George Hotz
739f327d2d
Shorter ( #1582 )
* deleting lines
* remove insert dims
* if statement is never hit
* bug fixes
2023-08-20 08:12:16 -07:00
geohotstan
a293c18d34
Gather bugfix ( #1561 )
2023-08-16 19:53:14 -04:00
geohotstan
8763037f0e
Fancy indexing is fancy wow and gather thing ( #1399 )
2023-08-16 18:35:49 -04:00
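A hedged sketch of the indexing forms this PR covers; the exact set of supported index types here is an assumption:

```python
from tinygrad.tensor import Tensor

t = Tensor([[1, 2], [3, 4], [5, 6]])
print(t[Tensor([0, 2])].numpy())  # tensor index: rows 0 and 2 -> [[1 2] [5 6]]
print(t[:, 1].numpy())            # slice: second column -> [2 4 6]
```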
YiMing Han
e00acb1eaf
fix deepwalk ctx check ( #1536 )
2023-08-13 23:03:17 -07:00
Jacky Lee
ef5f648e2f
Tensor.scaled_dot_product_attention to match torch, used in LLaMA, and tested ( #1502 )
* Implement scaled_dot_product_attention and test
* Support attn_mask
* Support is_causal too
* Use in llama
* Don't forget to reshape
* Set requires_grad=False for causal
* Remove staticmethod
* Remove extra spaces
2023-08-08 23:27:13 -07:00
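Usage sketch, matching the torch signature the title names:

```python
from tinygrad.tensor import Tensor

q = Tensor.randn(1, 8, 16, 64)  # (batch, heads, seq, head_dim)
k = Tensor.randn(1, 8, 16, 64)
v = Tensor.randn(1, 8, 16, 64)
out = q.scaled_dot_product_attention(k, v, is_causal=True)  # causal path, as used in LLaMA
print(out.shape)  # (1, 8, 16, 64)
```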
George Hotz
d24f936501
just cmplt ( #1493 )
* just cmplt
* fix maximum
* don't save, there's no backward
* ugh, no slot either
* eq is a scam
2023-08-08 13:58:10 -07:00
George Hotz
d67e248d9b
simple bitcast 2 ( #1445 )
* simple bitcast 2
* bc 2
* empty
* Revert "empty"
This reverts commit d8ee083655.
2023-08-06 00:30:50 -07:00
Francesco Castelli
579f4615a0
Add assert for wrong matmul/dot shapes ( #1438 )
2023-08-04 18:16:56 -04:00
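What the assert catches:

```python
from tinygrad.tensor import Tensor

a, b = Tensor.randn(2, 3), Tensor.randn(4, 5)
try:
    c = a @ b  # inner dims 3 vs 4 don't match
except AssertionError as e:
    print("shape mismatch caught:", e)
```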
Umut Zengin
52db7d7435
inf, -inf support for pad ( #1436 )
2023-08-04 15:05:25 -04:00
Umut Zengin
8889821547
Const pad support to pad2d and slice ( #1392 )
* slice to pad2d migrate
* Gain line
* Mypy happy
* Mypy happy
* Revert
* whitespace
2023-08-02 08:58:52 -07:00
Umut Zengin
0de5f20970
Re-open constant pad support to Tensor.pad ( #1388 )
* Added const padding support to .pad
* Linter
2023-07-31 17:08:57 -07:00
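A hedged sketch of the constant padding from #1388, which #1436 above extends to inf/-inf fills; the value keyword name is an assumption:

```python
from tinygrad.tensor import Tensor

t = Tensor([[1.0, 2.0], [3.0, 4.0]])
p = t.pad(((1, 1), (0, 0)), value=float("-inf"))  # (before, after) per dim, -inf fill
print(p.numpy())
```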
JaSpa99
5ab12059da
rng hlops: add normal and kaiming_normal ( #1378 )
* add normal and kaiming_normal
* make sure its float
* add tests
2023-07-31 10:37:02 -07:00
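Usage sketch of the two rng hlops:

```python
from tinygrad.tensor import Tensor

w = Tensor.kaiming_normal(64, 32, 3, 3)     # fan-in scaled init for conv weights
n = Tensor.normal(2, 2, mean=0.0, std=1.0)
print(w.shape, n.dtype)  # float output, per the "make sure its float" bullet
```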
wozeparrot
32d1afa4b5
feat: correct case when base is 0 ( #1360 )
2023-07-27 13:53:38 -04:00
wozeparrot
c22e77abfd
Match torch on fractional negative base pow ( #1352 )
* feat: match torch on fractional negative base pow
* feat: tests for trunc
2023-07-26 19:14:54 -07:00
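Behavior sketch covering this and the base-0 fix above (#1360); the base-0 line assumes torch semantics (0 ** 0 == 1):

```python
from tinygrad.tensor import Tensor

print((Tensor([-2.0]) ** 0.5).numpy())  # [nan], fractional power of a negative base, matching torch
print((Tensor([-2.0]) ** 2.0).numpy())  # [4.], integer exponents unaffected
print((Tensor([0.0]) ** 0.0).numpy())   # [1.], the base-0 case
```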
Umut Zengin
d4ebadf2da
Small Tensor.cat optimization and reformatting ( #1347 )
2023-07-26 18:01:12 -04:00
geohotstan
4056f97187
Gather ( #1329 )
2023-07-25 15:05:41 -04:00
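A hedged sketch of the op; the (idx, dim) argument order is an assumption about the signature at this point in the history:

```python
from tinygrad.tensor import Tensor

t = Tensor([[1, 2], [3, 4]])
idx = Tensor([[0, 0], [1, 0]])
print(t.gather(idx, 1).numpy())  # torch.gather semantics along dim 1 -> [[1 1] [4 3]]
```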
waifairer
d89fb729e5
flake8 ( #1323 )
* flake8: Ignore frequent violations, correct infrequent ones
* Ignore some rules in test
* Reorder test ignores
* Lint test + main
* EOF indent
* Include all E71,E72 errors
* Test the failing case in CI
* Revert "Test the failing case in CI"
This reverts commit 110add0a70.
* Push to test!
This reverts commit f317532779.
* ok back to passing
This reverts commit ba5052685f.
* Prove that CI fails when formatting is incorrect.
* Fix formatting
* Remove duplicitous E117 rule
* Use flake8 config for precommit
---------
Co-authored-by: waifairer <waifairer@gmail.com>
2023-07-24 11:19:58 -04:00
George Hotz
086382b64e
Revert "Fix max nan ( #1298 )" ( #1334 )
This reverts commit 50774470b2.
2023-07-23 20:41:28 -07:00
uncommonSensor
50774470b2
Fix max nan ( #1298 )
* Fix max nan
* Adds nan check option to max function
* Calls to max can pass in "ignore_nan=True" argument
* Added max nan CI tests
* Turned off due to the need for granularity
2023-07-23 19:39:44 -07:00
madt2709
d2c1e8409a
Update arange to be (start, stop, step) ( #1308 )
2023-07-21 00:27:23 -04:00
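Usage of the new signature:

```python
from tinygrad.tensor import Tensor

print(Tensor.arange(5).numpy())         # [0 1 2 3 4]
print(Tensor.arange(2, 10, 2).numpy())  # [2 4 6 8]
```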