DavidFarago
|
1ba8f0dca3
|
Quickstart: Upgrade section "Training" to new code (#1663)
Co-authored-by: Dave Farago <dfarago@innoopract.com>
|
2023-08-24 17:12:16 -04:00 |
|
DavidFarago
|
29adae84eb
|
Quickstart: Use tensors to compute train accuracy (#1662)
Co-authored-by: Dave Farago <dfarago@innoopract.com>
|
2023-08-24 17:09:12 -04:00 |
|
George Hotz
|
d37d092c14
|
split linearizer into 3 files (#1654)
|
2023-08-23 14:58:47 -07:00 |
|
George Hotz
|
1b8c40234f
|
Uast start (#1650)
* work
* more tests
* more tests 2
* don't break it
|
2023-08-23 12:00:06 -07:00 |
|
geohotstan
|
484708da87
|
#1615 fix (#1616)
|
2023-08-23 14:51:05 -04:00 |
|
Pavol Rusnak
|
b57c374164
|
add accelerator links to readme (#1649)
|
2023-08-23 14:47:55 -04:00 |
|
George Hotz
|
82623697a8
|
Move asm renderer (#1648)
* teeny changes
* teeny updates
* move to renderer
|
2023-08-23 10:06:43 -07:00 |
|
George Hotz
|
a89363574d
|
teeny changes (#1647)
* teeny changes
* teeny updates
|
2023-08-23 09:53:39 -07:00 |
|
George Hotz
|
a6d842af7a
|
move device to ops (#1646)
* move device to ops
* mlops types
* 2 lines
|
2023-08-23 08:30:17 -07:00 |
|
nimlgen
|
a65ae1198b
|
do replace div->mul for non-floats (#1644)
|
2023-08-23 07:34:31 -07:00 |
|
George Hotz
|
da694d4241
|
move that image import
|
2023-08-22 21:30:55 -07:00 |
|
George Hotz
|
41e83be3dd
|
simple where broadcast (#1643)
|
2023-08-22 21:24:49 -07:00 |
|
George Hotz
|
c831218139
|
Optional: Reduce line count and simplify the LazyBuffer interface (#1642)
* less lines in lazybuffer, def e
* custom function
* cast
* reorder functions
* lb type
|
2023-08-22 21:01:10 -07:00 |
|
George Hotz
|
d25046e66a
|
matvec tests (#1634)
* matvec tests
* f16
* f16 is broken
|
2023-08-22 17:33:58 -07:00 |
|
George Hotz
|
643cbdfd50
|
make embedding and GPT-2 fast (#1631)
* make embedding fast
* jit more, variable shape support
* print mem bw
|
2023-08-22 15:14:38 -07:00 |
|
Niklas D
|
a7752ad65d
|
Fix link to state.py in quickstart (#1632)
|
2023-08-22 17:39:30 -04:00 |
|
c143
|
c9c40bb16f
|
Import whole math module in tensor.py (#1628)
|
2023-08-22 17:07:46 -04:00 |
|
Roelof van Dijk
|
6fcfa50b35
|
[ready] perf: no noop cast just to make mypy happy (#1626)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
|
2023-08-22 17:07:22 -04:00 |
|
Roelof van Dijk
|
f04a6d7882
|
perf: faster partition (#1625)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
|
2023-08-22 11:56:41 -07:00 |
|
George Hotz
|
d3c401ba3c
|
llama quantize: scale uses mul, not div
|
2023-08-22 11:48:56 -07:00 |
|
George Hotz
|
696e4d20a1
|
fix KOPT=2 with variable shape
|
2023-08-22 11:34:34 -07:00 |
|
George Hotz
|
de1fcc418f
|
no more toCPU path (#1624)
|
2023-08-22 11:07:26 -07:00 |
|
George Hotz
|
463dece63e
|
auto arg dtypes (#1623)
|
2023-08-22 10:22:40 -07:00 |
|
George Hotz
|
db8344ab83
|
add noalias to llvm (#1622)
|
2023-08-22 09:26:01 -07:00 |
|
chenyu
|
89e13f2f04
|
support symbols in shrink (#1611)
|
2023-08-22 09:08:21 -07:00 |
|
George Hotz
|
718ced296c
|
move state to nn/state (#1619)
|
2023-08-22 07:36:24 -07:00 |
|
Umut Zengin
|
1e93fd5449
|
Readability for unreadable functions (#1610)
* cleaned
* typing
* typing
* if format
* if format
* mypy
* update argmax
* argmax more readable
* More stable def pad
* lint
|
2023-08-22 07:09:08 -07:00 |
|
George Hotz
|
86a32ffb1a
|
lt sum (#1617)
|
2023-08-21 21:19:16 -07:00 |
|
George Hotz
|
c64c47a6ae
|
test arange simple
|
2023-08-21 20:16:17 -07:00 |
|
George Hotz
|
4f459841bc
|
Symbolic JIT for GPT2 (#1613)
* not fast yet
* simpler
* symbolic jit
* fp16 GOPS and GB
|
2023-08-21 19:44:57 -07:00 |
|
Yixiang Gao
|
4f02491cd4
|
add cpu if torch tensor (#1609)
|
2023-08-21 16:57:59 -07:00 |
|
Umut Zengin
|
f720682beb
|
np.argmax to Tensor.argmax (#1608)
* to tensor argmax
* removed keepdim
* training update
|
2023-08-21 15:22:29 -07:00 |
|
George Hotz
|
4ea00bad38
|
track down llama bug
|
2023-08-21 15:14:21 -07:00 |
|
Roelof van Dijk
|
b02f77b354
|
perf: faster broadcasted (#1601)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
|
2023-08-21 14:21:46 -07:00 |
|
Yixiang Gao
|
4d54afb6df
|
sparse cat cross entropy (#1597)
* add sparse cat cross entropy
* minor fix
* add log_softmax into loss function
* add test
* update docs
* fix training loss
* add device
|
2023-08-21 14:14:54 -07:00 |
|
Roelof van Dijk
|
109100656f
|
refactor: no len if it is not needed (#1598)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
|
2023-08-21 14:06:32 -07:00 |
|
Roelof van Dijk
|
2c8f8ac611
|
perf: no ret needed (#1604)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
|
2023-08-21 14:05:13 -07:00 |
|
Roelof van Dijk
|
750714c386
|
perf: namedtuples are hashable, don't need a key (#1607)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
|
2023-08-21 14:01:10 -07:00 |
|
George Hotz
|
aaa6fdf347
|
this was unused code (#1600)
|
2023-08-21 12:02:58 -07:00 |
|
Roelof van Dijk
|
8e8724d3a8
|
perf: if argument order (mops) (#1599)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
|
2023-08-21 11:20:39 -07:00 |
|
George Hotz
|
2e60920317
|
Revert "sparse cat cross entropy (#1591)" (#1596)
This reverts commit f0ee850e98.
|
2023-08-21 10:04:26 -07:00 |
|
Yixiang Gao
|
f0ee850e98
|
sparse cat cross entropy (#1591)
* add sparse cat cross entropy
* minor fix
* add log_softmax into loss function
* add test
* update docs
|
2023-08-21 09:56:41 -07:00 |
|
Yixiang Gao
|
8d6662a741
|
.cpu().numpy() -> .numpy() (#1594)
* .cpu().numpy() -> .numpy()
* restore ops_torch
* restore test_speed_v_torch
|
2023-08-21 09:53:29 -07:00 |
|
Umut Zengin
|
35bf21276f
|
Argmax/Argmin Feature (#1576)
* implemented argmax and argmin
* lint
* lint
* match torch behaviour
* format
* removed flip
|
2023-08-20 18:46:46 -07:00 |
|
Roelof van Dijk
|
1900acda09
|
[READY] ci: setup venv cache (#1475)
* ci: cache installed packages
* ci: trigger jobs
* ci: fix hashfiles argument
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
|
2023-08-20 18:43:16 -07:00 |
|
Umut Zengin
|
3fc7e984f0
|
__getitem__ refactoring (#1586)
* try
* try
* form
* form
* form
* form
* lint
* small change
* preserve old
* revert to explicit reshape
|
2023-08-20 18:42:30 -07:00 |
|
George Hotz
|
d627349af0
|
teeny changes (#1589)
* teeny changes
* import order
|
2023-08-20 13:38:38 -07:00 |
|
George Hotz
|
012ee7d162
|
not worth the speed (#1584)
* not worth the speed
* no slots
* uops comments
* bump to python 3.11 for speed
* add critical slots back
|
2023-08-20 10:24:58 -07:00 |
|
George Hotz
|
739f327d2d
|
Shorter (#1582)
* deleting lines
* remove insert dims
* if statement is never hit
* bug fixes
|
2023-08-20 08:12:16 -07:00 |
|
David Hou
|
4fbce972d7
|
CSE at uop level (#1483)
* uop-level cse
* add test
* don't cache reduce alu ops
* types
* rename variable
* fix
* delete lines
|
2023-08-19 23:40:40 -07:00 |
|