Commit Graph

456 Commits

Each entry lists the author, the short SHA-1, the commit message, and the date.
George Hotz
ed194a1d3b zero fold (#1748)
* add constant fold

* err, it's just zero folding

* self store fold + caching

* prints and more folds

* simpler winograd kernels

* remove childless uops
2023-09-03 13:48:11 -07:00
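
The zero fold and "remove childless uops" steps are two standard graph cleanups: rewrite anything multiplied by a constant zero into the zero constant, and drop nodes no output depends on. A toy sketch of both, assuming a tuple-based IR rather than tinygrad's real uop machinery:

    # a minimal sketch, not tinygrad's actual uop classes
    def fold(n):
      if n[0] == "mul" and ("const", 0) in n[1:]: return ("const", 0)  # x*0 -> 0
      if n[0] == "add" and n[2] == ("const", 0): return n[1]           # x+0 -> x
      return n

    def live_uops(outputs, deps):
      # "remove childless uops": keep only nodes reachable from the outputs
      live, stack = set(), list(outputs)
      while stack:
        u = stack.pop()
        if u not in live:
          live.add(u)
          stack.extend(deps.get(u, ()))
      return live

    assert fold(("mul", ("const", 0), "x")) == ("const", 0)
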
George Hotz
e17b1af160 UnaryOps.NEG (#1749) 2023-09-03 12:44:26 -07:00
George Hotz
e910e0e62c folding mul by 0 (#1743)
* why doesn't this work

* zero mlop

* explicit fold in winograd
2023-09-03 09:04:12 -07:00
David Hou
3151d91f6e 3x3 winograd convs (#1675)
* winograd

* simplify local groups code

* comment

* respects self.opts.has_local

* always simplify ones

* make mypy happy

* move reshape, WINO flag

* wino flag, simple forward backward test for wino

* extra wino test

* merge oops

* comments

* axis_needs_valid -> axis_is_masked

* don't delete needs_valid (it's unused though)

* make linter happy

* make linter happy

* smaller test

* change number

* make wino tests very small
2023-09-03 07:29:43 -07:00
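
For reference, the Winograd F(2,3) transform behind these kernels produces two outputs of a 3-tap convolution with four multiplies instead of six; the 3x3 conv case nests the same matrices in 2D. A numpy check of the 1D identity y = A^T[(G g) * (B^T d)], using the standard Lavin & Gray constants (not code from the PR):

    import numpy as np

    Bt = np.array([[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]], dtype=float)
    G  = np.array([[1, 0, 0], [.5, .5, .5], [.5, -.5, .5], [0, 0, 1]])
    At = np.array([[1, 1, 1, 0], [0, 1, -1, -1]], dtype=float)

    d, g = np.random.randn(4), np.random.randn(3)   # input tile, filter taps
    y = At @ ((G @ g) * (Bt @ d))                   # 4 multiplies for 2 outputs
    ref = np.array([d[0:3] @ g, d[1:4] @ g])        # direct 3-tap correlation
    assert np.allclose(y, ref)
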
geohotstan
e36148b1ce Make __getitem__ TINYer (#1661) 2023-09-02 23:01:01 -04:00
George Hotz
458eb89463 minor changes from prerender (#1734) 2023-09-01 10:04:47 -07:00
Roelof van Dijk
62536d6000 perf: use enumerate where possible (#1692)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-30 10:41:51 -07:00
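
The pattern behind this cleanup, on a made-up list:

    xs = ["a", "b", "c"]
    pairs = [(i, xs[i]) for i in range(len(xs))]  # before: manual indexing
    pairs = list(enumerate(xs))                   # after: index and element together
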
Roelof van Dijk
50f669e43b [ready] perf: simpler Tensor init (#1679)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-27 22:18:03 -04:00
chenyu
66fbf4800b fix symbolic_ops tests with Tensor.training=True (#1686) 2023-08-26 23:19:56 -04:00
Jordan Wright
25be7f745d Tensor.uniform with dtype=int bug fix (#1593) 2023-08-26 01:59:53 -04:00
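
One classic pitfall with integer uniforms (the PR's actual fix may differ): sampling floats and casting truncates toward zero, which skews the distribution; flooring before the cast keeps every integer in [low, high) equally likely. A numpy stand-in:

    import numpy as np
    low, high = -3, 4
    bad  = np.random.uniform(low, high, 10).astype(int)            # truncates toward 0
    good = np.floor(np.random.uniform(low, high, 10)).astype(int)  # uniform over [-3, 3]
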
Roelof van Dijk
02e64da678 refactor: tuples can be concatenated with + (#1671)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-25 12:37:13 -04:00
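
The refactor in question, on a made-up example:

    a, b = (1, 2), (3,)
    joined = tuple(list(a) + list(b))  # before: round-trip through lists
    joined = a + b                     # after: tuples concatenate directly
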
George Hotz
1b8c40234f Uast start (#1650)
* work

* more tests

* more tests 2

* don't break it
2023-08-23 12:00:06 -07:00
geohotstan
484708da87 #1615 fix (#1616) 2023-08-23 14:51:05 -04:00
George Hotz
a6d842af7a move device to ops (#1646)
* move device to ops

* mlops types

* 2 lines
2023-08-23 08:30:17 -07:00
nimlgen
a65ae1198b do not replace div->mul for non-floats (#1644) 2023-08-23 07:34:31 -07:00
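
Why the float restriction matters: rewriting x / c as x * (1 / c) is only sound for floating point, and even there only approximately; with integer semantics the reciprocal's rounding error changes results. A small demonstration:

    x, c = 49, 49
    assert x // c == 1
    assert int(x * (1.0 / c)) == 0   # 49 * (1/49) = 0.9999999999999999
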
George Hotz
da694d4241 move that image import 2023-08-22 21:30:55 -07:00
George Hotz
41e83be3dd simple where broadcast (#1643) 2023-08-22 21:24:49 -07:00
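
The semantics here, in numpy terms: where's three arguments broadcast against each other before selecting.

    import numpy as np
    cond = np.array([[True], [False]])    # (2, 1)
    np.where(cond, np.arange(3), -1)      # (2, 1), (3,), scalar -> (2, 3)
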
George Hotz
c831218139 Optional: Reduce line count and simplify the LazyBuffer interface (#1642)
* less lines in lazybuffer, def e

* custom function

* cast

* reorder functions

* lb type
2023-08-22 21:01:10 -07:00
c143
c9c40bb16f Import whole math module in tensor.py (#1628) 2023-08-22 17:07:46 -04:00
Roelof van Dijk
6fcfa50b35 [ready] perf: no noop cast just to make mypy happy (#1626)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-22 17:07:22 -04:00
George Hotz
de1fcc418f no more toCPU path (#1624) 2023-08-22 11:07:26 -07:00
Umut Zengin
1e93fd5449 Readability for unreadable functions (#1610)
* cleaned

* typing

* typing

* if format

* if format

* mypy

* update argmax

* argmax more readable

* More stable def pad

* lint
2023-08-22 07:09:08 -07:00
Roelof van Dijk
b02f77b354 perf: faster broadcasted (#1601)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-21 14:21:46 -07:00
Yixiang Gao
4d54afb6df sparse cat cross entropy (#1597)
* add sparse cat cross entropy

* minor fix

* add log_softmax into loss function

* add test

* update docs

* fix training loss

* add device
2023-08-21 14:14:54 -07:00
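
Sparse categorical cross entropy takes integer class labels instead of one-hot targets, and folding log_softmax into the loss is the numerically stable formulation. A numpy sketch of the math (not the tinygrad implementation):

    import numpy as np

    def sparse_cat_crossentropy(logits, labels):
      z = logits - logits.max(axis=1, keepdims=True)           # stabilized log_softmax
      logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
      return -logp[np.arange(len(labels)), labels].mean()      # pick true-class log-probs

    loss = sparse_cat_crossentropy(np.random.randn(4, 10), np.array([3, 1, 0, 9]))
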
Roelof van Dijk
109100656f refactor: no len if it is not needed (#1598)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-21 14:06:32 -07:00
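
The refactor in question:

    xs = []
    if len(xs) == 0: ...   # before: explicit length check
    if not xs: ...         # after: emptiness is just falsiness
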
George Hotz
2e60920317 Revert "sparse cat cross entropy (#1591)" (#1596)
This reverts commit f0ee850e98.
2023-08-21 10:04:26 -07:00
Yixiang Gao
f0ee850e98 sparse cat cross entropy (#1591)
* add sparse cat cross entropy

* minor fix

* add log_softmax into loss function

* add test

* update docs
2023-08-21 09:56:41 -07:00
Umut Zengin
35bf21276f Argmax/Argmin Feature (#1576)
* implemented argmax and argmin

* lint

* lint

* match torch behaviour

* format

* removed flip
2023-08-20 18:46:46 -07:00
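
An argmax can be built purely from reductions and elementwise ops, returning the first occurrence on ties as torch does; weighting matches by a descending arange is what makes the final flip unnecessary. A numpy sketch:

    import numpy as np

    def argmax_last_axis(x):
      n = x.shape[-1]
      hit = (x == x.max(axis=-1, keepdims=True))            # 1 where the max occurs
      return n - (hit * np.arange(n, 0, -1)).max(axis=-1)   # ties -> first index

    x = np.array([[1, 3, 3], [5, 2, 5]])
    assert (argmax_last_axis(x) == np.array([1, 0])).all()
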
Umut Zengin
3fc7e984f0 __getitem__ refactoring (#1586)
* try

* try

* form

* form

* form

* form

* lint

* small change

* preserve old

* revert to explicit reshape
2023-08-20 18:42:30 -07:00
George Hotz
d627349af0 teeny changes (#1589)
* teeny changes

* import order
2023-08-20 13:38:38 -07:00
George Hotz
739f327d2d Shorter (#1582)
* deleting lines

* remove insert dims

* if statement is never hit

* bug fixes
2023-08-20 08:12:16 -07:00
geohotstan
a293c18d34 Gather bugfix (#1561) 2023-08-16 19:53:14 -04:00
geohotstan
8763037f0e Fancy indexing is fancy wow and gather thing (#1399) 2023-08-16 18:35:49 -04:00
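
For context, "fancy indexing" is indexing with integer arrays, which lowers to gathers:

    import numpy as np
    x = np.arange(12).reshape(3, 4)
    x[[2, 0]]           # rows 2 and 0 -> shape (2, 4)
    x[[0, 1], [3, 2]]   # pointwise pairs -> array([3, 6])
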
YiMing Han
e00acb1eaf fix deepwalk ctx check (#1536) 2023-08-13 23:03:17 -07:00
Jacky Lee
ef5f648e2f Tensor.scaled_dot_product_attention to match torch, used in LLaMA, and tested (#1502)
* Implement scaled_dot_product_attention and test

* Support attn_mask

* Support is_causal too

* Use in llama

* Don't forget to reshape

* Set requires_grad=False for causal

* Remove staticmethod

* Remove extra spaces
2023-08-08 23:27:13 -07:00
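
The torch semantics being matched, as a numpy sketch over (..., seq, dim) arrays (illustrative, not the tinygrad code):

    import numpy as np

    def sdpa(q, k, v, attn_mask=None, is_causal=False):
      scores = q @ k.swapaxes(-1, -2) / np.sqrt(q.shape[-1])
      if is_causal:  # hide future positions with -inf before softmax
        scores = np.where(np.tril(np.ones(scores.shape[-2:], bool)), scores, -np.inf)
      if attn_mask is not None:
        scores = scores + attn_mask
      w = np.exp(scores - scores.max(axis=-1, keepdims=True))
      return (w / w.sum(axis=-1, keepdims=True)) @ v
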
George Hotz
d24f936501 just cmplt (#1493)
* just cmplt

* fix maximum

* don't save, there's no backward

* ugh, no slot either

* eq is a scam
2023-08-08 13:58:10 -07:00
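
"Just cmplt" means less-than is the only comparison primitive; since a<b and b<a are exclusive 0/1 masks, the rest follow, including the "scam" eq. In numpy terms:

    import numpy as np
    a, b = np.array([1.0, 2.0, 3.0]), np.array([3.0, 2.0, 1.0])
    lt = (a < b).astype(float)       # the one primitive: cmplt
    gt = (b < a).astype(float)
    eq = 1.0 - lt - gt               # equality derived, no eq primitive needed
    maximum = a * (1.0 - lt) + b * lt
    assert (maximum == np.maximum(a, b)).all()
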
George Hotz
d67e248d9b simple bitcast 2 (#1445)
* simple bitcast 2

* bc 2

* empty

* Revert "empty"

This reverts commit d8ee083655.
2023-08-06 00:30:50 -07:00
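
A bitcast reinterprets raw bits at a new dtype rather than converting the value; numpy's view shows the difference:

    import numpy as np
    x = np.array([1.0], dtype=np.float32)
    x.astype(np.int32)   # value cast -> array([1])
    x.view(np.int32)     # bitcast    -> array([1065353216]), i.e. 0x3F800000
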
Francesco Castelli
579f4615a0 Add assert for wrong matmul/dot shapes (#1438) 2023-08-04 18:16:56 -04:00
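
The shape rule the new assert enforces:

    import numpy as np
    np.ones((2, 3)) @ np.ones((3, 4))    # ok: inner dims match -> (2, 4)
    # np.ones((2, 3)) @ np.ones((4, 4))  # should raise, not silently misbehave
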
Umut Zengin
52db7d7435 inf, -inf support for pad (#1436) 2023-08-04 15:05:25 -04:00
Umut Zengin
8889821547 Const pad support to pad2d and slice (#1392)
* slice to pad2d migrate

* Gain line

* Mypy happy

* Mypy happy

* Revert

* whitespace
2023-08-02 08:58:52 -07:00
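
Constant-value padding (including the +/-inf case above) can be built from zero padding plus a mask select, with no new primitive. A numpy sketch of the trick, with a hypothetical helper name:

    import numpy as np

    def pad_const(x, pw, value):
      padded = np.pad(x, pw)                      # zeros in the border
      inside = np.pad(np.ones_like(x, bool), pw)  # True inside, False in the border
      return np.where(inside, padded, value)      # select is safe even for +/-inf

    pad_const(np.ones((2, 2)), 1, -np.inf)
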
Umut Zengin
0de5f20970 Re-open constant pad support to Tensor.pad (#1388)
* Added const padding support to .pad

* Linter
2023-07-31 17:08:57 -07:00
JaSpa99
5ab12059da rng hlops: add normal and kaiming_normal (#1378)
* add normal and kaiming_normal

* make sure its float

* add tests
2023-07-31 10:37:02 -07:00
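
Kaiming-normal init draws from N(0, std^2) with std = gain / sqrt(fan_in) and gain = sqrt(2 / (1 + a^2)) for leaky-ReLU slope a (He et al.); a sketch of the scale computation, parameter names assumed:

    import math

    def kaiming_std(shape, a=0.01):
      gain = math.sqrt(2.0 / (1 + a ** 2))  # leaky-ReLU gain, as in torch
      fan_in = math.prod(shape[1:])         # inputs feeding each output unit
      return gain / math.sqrt(fan_in)

    std = kaiming_std((64, 32, 3, 3))       # e.g. a conv weight (out, in, kh, kw)
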
wozeparrot
32d1afa4b5 feat: correct case when base is 0 (#1360) 2023-07-27 13:53:38 -04:00
wozeparrot
c22e77abfd Match torch on fractional negative base pow (#1352)
* feat: match torch on fractional negative base pow

* feat: tests for trunc
2023-07-26 19:14:54 -07:00
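
Both pow fixes (this one and the base-0 one above) chase torch's edge cases: 0**0 is 1, and a negative base yields NaN for fractional exponents but an alternating sign for integer ones. A scalar model of the target semantics, from my reading of the titles:

    import math

    def pow_like_torch(base: float, exp: float) -> float:
      if base == 0.0:               # 0**0 == 1, 0**pos == 0, 0**neg == inf
        return 1.0 if exp == 0 else (0.0 if exp > 0 else math.inf)
      if base < 0:
        if exp != math.trunc(exp):  # fractional power of a negative base
          return math.nan
        return (-1.0) ** int(exp) * abs(base) ** exp
      return base ** exp
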
Umut Zengin
d4ebadf2da Small Tensor.cat optimization and reformatting (#1347) 2023-07-26 18:01:12 -04:00
geohotstan
4056f97187 Gather (#1329) 2023-07-25 15:05:41 -04:00
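
Gather's contract, for reference: out[i][j] = x[i][index[i][j]] along the chosen axis, i.e. numpy's take_along_axis:

    import numpy as np
    x = np.array([[10, 20, 30], [40, 50, 60]])
    idx = np.array([[2, 0], [1, 1]])
    np.take_along_axis(x, idx, axis=1)   # [[30, 10], [50, 50]]
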
waifairer
d89fb729e5 flake8 (#1323)
* flake8: Ignore frequent violations, correct infrequent ones

* Ignore some rules in test

* Reorder test ignores

* Lint test + main

* EOF indent

* Include all E71,E72 errors

* Test the failing case in CI

* Revert "Test the failing case in CI"

This reverts commit 110add0a70.

* Push to test!
This reverts commit f317532779.

* ok back to passing
This reverts commit ba5052685f.

* Prove that CI fails when formatting is incorrect.

* Fix formatting

* Remove duplicitous E117 rule

* Use flake8 config for precommit

---------

Co-authored-by: waifairer <waifairer@gmail.com>
2023-07-24 11:19:58 -04:00
George Hotz
086382b64e Revert "Fix max nan (#1298)" (#1334)
This reverts commit 50774470b2.
2023-07-23 20:41:28 -07:00
uncommonSensor
50774470b2 Fix max nan (#1298)
* Fix max nan

* Adds nan check option to max function
* Calls to max can pass in "ignore_nan=True" argument
* Added max nan CI tests

* Fix max nan

* Adds nan check option to max function
* Calls to max can pass in "ignore_nan=True" argument
* Added max nan CI tests
* Turned off due to the need for granularity
2023-07-23 19:39:44 -07:00
madt2709
d2c1e8409a Update arange to be (start, stop, step) (#1308) 2023-07-21 00:27:23 -04:00
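
After this change Tensor.arange mirrors Python's range and numpy's arange (assumed usage):

    from tinygrad.tensor import Tensor
    Tensor.arange(5)         # 0, 1, 2, 3, 4
    Tensor.arange(2, 10)     # 2 .. 9
    Tensor.arange(2, 10, 3)  # 2, 5, 8
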