chenyu
f1bf916b8a
apply NOOPT in test_arange complexity ( #4774 )
...
with hcopt, arange(2560) uses fewer ops than arange(256)
2024-05-29 23:12:35 -04:00
chenyu
cde7a7cda7
isolate the 134ms kernel in train_gpt2.py ( #4773 )
...
133ms on tinybox red with BEAM=2
2024-05-29 17:26:24 -04:00
nimlgen
57204c4014
amd cleanup pm4 queue ( #4772 )
2024-05-29 22:59:06 +03:00
lopusz
b2c408912c
Add docs link to README ( #4768 )
2024-05-29 17:47:47 +00:00
chenyu
f2414c666f
fix train_gpt2.py ( #4771 )
...
added `with Tensor.train():`
2024-05-29 12:01:34 -04:00
chenyu
59c6472b9f
check contiguous in View.create after canonicalizing mask and offset ( #4770 )
...
mask / offset / strides can change during canonicalization, and contiguous can be True at the end
2024-05-29 11:31:13 -04:00
qazal
6e5fa5fd92
map local aliases to reduceop ( #4766 )
...
* map
* ugh
* save one line
* concerning, does this pass
* Revert "concerning, does this pass"
This reverts commit 64d4664f17.
* use local_alias
2024-05-28 21:11:25 -04:00
chenyu
7624ad3ddd
add --timing and --profile to llama3 example ( #4767 )
2024-05-28 16:24:44 -04:00
qazal
c235223c07
refactor tc_opt creation ( #4765 )
...
* move reduceop loop
* this is more mergeable code
add assert
* integrate s2
2024-05-28 23:10:27 +03:00
qazal
a88aea626d
map tensor core bufs to reduceop ( #4763 )
...
* tc_opts.bufs to its only map
* lint
* iterate reduceop bufs
2024-05-28 22:07:39 +03:00
wozeparrot
6fcf220b21
feat: tag 0.9.0 ( #4762 )
v0.9.0
2024-05-28 18:44:45 +00:00
chenyu
e22cdb40f3
docs: fix mkdoc warnings and link to tensor.md ( #4760 )
2024-05-28 14:24:11 -04:00
nimlgen
872827b6ae
fix usage of args struct in hcq ( #4758 )
...
* do not allocate empty buffer in hcq
* do not take args struct from program
2024-05-28 21:10:39 +03:00
wozeparrot
b2b49cef6f
split tensor docs ( #4754 )
2024-05-28 11:03:52 -07:00
nimlgen
fe26d3fefe
nv sync before free for binded commands ( #4759 )
...
* nv sync before free for binded commands
* shorter comment
2024-05-28 20:49:29 +03:00
chenyu
e614b7c696
docs: showcase remove mnist_gan and add conversation.py ( #4757 )
...
fixed both examples, and i think it's better to show conversation
2024-05-28 11:09:26 -04:00
nimlgen
019f4680e5
check dims before execution on nv ( #4756 )
...
* check dims before execution on nv
* fix linter
2024-05-28 16:57:28 +03:00
qazal
0e824741c4
pre multi reduce codegen/* cleanup ( #4755 )
...
* refactor self.reduceop
* free lines
* fix test
2024-05-28 08:15:48 -04:00
chenyu
fd249422f5
minor cleanup example stable_diffusion ( #4753 )
2024-05-28 00:05:37 -04:00
chenyu
53b9081aab
check arg types of Tensor.randint ( #4751 )
...
raise TypeError if low, high, dtype are not ints
2024-05-27 20:24:10 -04:00
chenyu
16756af13c
docs: polish tensor.py ( #4750 )
...
* docs: polish tensor.py
* don't change that
2024-05-27 20:00:56 -04:00
Elias Wahl
c4b0acf095
Global norm + small changes ( #4749 )
...
* norm
* no empty
* default loss scaler in float
2024-05-27 18:35:27 -04:00
chenyu
c7beb36b73
docs: update page headers ( #4748 )
...
replaced the "Index" from Home page with "tinygrad documentation"
2024-05-27 18:04:14 -04:00
wozeparrot
0680b9cb60
expand sidebar by default ( #4747 )
2024-05-27 15:03:31 -07:00
chenyu
db0e19fbd7
docs: minor update to quickstart ( #4746 )
...
update import, point to docs more, remove mention that assign to index not supported
2024-05-27 17:46:36 -04:00
wozeparrot
4cb38a15a5
tweak docs style ( #4745 )
2024-05-27 14:32:09 -07:00
chenyu
5b323d77db
docs: change nav structure and add repo_url ( #4743 )
...
tensor / function / dtype / nn under API now
2024-05-27 15:45:54 -04:00
nimlgen
50e95b8212
nv qmd sync ( #4740 )
...
* qmd sync
* better hcq
* mockgpu support chain qmd
* fix mockgpu & linter
2024-05-27 18:51:30 +03:00
qazal
c69929ee25
integrate ALU uoping ( #4741 )
2024-05-27 23:05:24 +08:00
qazal
0e69b22629
multireduce OptOps tests (start) ( #4733 )
...
* start
* full tests
* add skips
* unrelated
* notes
2024-05-27 12:21:33 +03:00
chenyu
0b58203cbe
docs: fix nn rendering ( #4736 )
...
* docs: fix nn rendering
need to import nn at the start of the page. maybe there's a better way to move it on top?
* print that
2024-05-26 18:55:20 -04:00
wozeparrot
5c6af97436
nn docs ( #4735 )
2024-05-26 13:08:22 -07:00
qazal
c7b1d802f1
delete duplicate tests in test_linearizer ( #4723 )
...
* delete duplicate test
test_simplify_uop isn't needed
max works
* ci
* remove skip
* add skip back
2024-05-26 08:11:42 +03:00
nimlgen
c87b066b66
optimize nv sync ( #4729 )
...
* optimize nv sync
* sdma signal without wfi
* nv mockgpu support
* sep change
2024-05-25 23:10:41 +03:00
chenyu
8415b14978
pow cleanup part 3 ( #4731 )
...
fast pow for int or (int+0.5) const exponent. and more comments
2024-05-25 15:48:52 -04:00
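The fast path named in #4731 (a constant exponent that is an integer or an integer plus 0.5) can be illustrated with a minimal sketch. This is plain Python on scalars, not the tinygrad implementation: the function name `fast_pow` and its structure are assumptions made for illustration. It uses repeated squaring for the integer part and one square root for the 0.5 remainder.

```python
import math

def fast_pow(x: float, exponent: float) -> float:
    """Hypothetical helper: x ** exponent for an integer or (integer + 0.5) exponent."""
    doubled = exponent * 2
    if doubled != int(doubled):
        raise ValueError("exponent must be an integer or an integer plus 0.5")

    negative = exponent < 0
    magnitude = abs(exponent)
    whole = int(magnitude)            # integer part of the exponent
    has_half = magnitude != whole     # True when a 0.5 remainder is left over

    # Exponentiation by squaring handles the integer part in O(log n) multiplies.
    result, base, n = 1.0, x, whole
    while n > 0:
        if n & 1:
            result *= base
        base *= base
        n >>= 1

    # The 0.5 remainder costs a single square root.
    if has_half:
        result *= math.sqrt(x)

    # Negative exponents are the reciprocal of the positive case.
    return 1.0 / result if negative else result
```

In this sketch an exponent of 0 returns 1.0 for any base, including 0.0, which matches Python's built-in `**`. Any other exponent is rejected rather than routed to a general `pow`.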
Szymon Ożóg
de5c69c4c9
Unify test_dtype naming conventions ( #4730 )
2024-05-25 10:12:40 -04:00
chenyu
7e90026eb0
pow cleanup part 2 ( #4727 )
...
more cleanups and fix 0 ** 0
2024-05-25 07:17:40 -04:00
chenyu
85e57223bd
pow cleanup part 1 ( #4726 )
...
use _broadcasted to convert 3 cases into 1. const simplification should be handled by const folding.
2024-05-25 03:24:10 -04:00
Szymon Ożóg
f7201b6852
Remove deprecated code ( #4724 )
2024-05-25 03:02:12 -04:00
wozeparrot
5f503226de
finish tensor docs ( #4722 )
2024-05-24 15:57:43 -07:00
chenyu
edf27470c1
docs: fix stack and add dtype.DType ( #4721 )
2024-05-24 18:23:01 -04:00
chenyu
a16d2572a0
docs: clean up mentions of mlops ( #4720 )
2024-05-24 17:49:32 -04:00
chenyu
31358cbea5
change Tensor.stack to method ( #4719 )
2024-05-24 17:04:19 -04:00
chenyu
ba116ff630
docs: fix mnist type and fixed seed and loss ( #4717 )
2024-05-24 16:18:30 -04:00
Szymon Ożóg
212025b53c
Int mulacc for ptx ( #4680 )
...
* IntMulacc
* don't mov const
* Don't do int mulacc on ocelot
* Workaround for ocelot
* Remove ocelot workaround
* Fix tests that merged into mulacc
* fix uop count after merging to mulacc
2024-05-24 15:20:48 -04:00
chenyu
0ac761716a
docs: logo and favicon ( #4716 )
2024-05-24 14:33:12 -04:00
Szymon Ożóg
a4de81e9a6
Update ocelot version ( #4715 )
2024-05-24 14:32:53 -04:00
chenyu
a894209bf7
docs: add ConstType to dtypes, limit function to its member ( #4714 )
2024-05-24 14:22:34 -04:00
chenyu
a41701ce71
docs: elementwise ops (broadcasted) and update examples ( #4713 )
...
* docs: elementwise ops (broadcasted) and update examples
* fix where
* space
2024-05-24 13:19:21 -04:00
qazal
c170ddceaf
fix commavq benchmark ( #4712 )
...
* fix _slice and assert explicit device
* with _slice
2024-05-24 19:40:57 +03:00