4667 Commits

Author SHA1 Message Date
wozeparrot
801bed4f66 Add ops_shm (#1413)
* feat: add ops_shm

* clean: extra newline

* feat: add test

* feat: ci doesn't like that

* feat: ci still doesn't like that

* feat: skip big test on ci

* feat: testing

* feat: big

* feat: testing again

* feat: reskip test
2023-08-03 17:40:52 -07:00
chenyu
34f348643b Support constant expand to symbolic shape (#1411) 2023-08-02 21:21:22 -07:00
chenyu
6572ca6835 support symbolic expand (#1407) 2023-08-02 20:03:46 -04:00
chenyu
18d0a93f09 LazyBuffer.get_variable_buffers() (#1391)
* LazyBuffer.get_variable_buffers()

* remove left_only, add ProdNode

* no vars for OpNode.b

* do not change symbolic vars, remove ProdNode
2023-08-02 09:01:35 -07:00
Umut Zengin
8889821547 Const pad support to pad2d and slice (#1392)
* slice to pad2d migrate

* Gain line

* Mypy happy

* Mypy happy

* Revert

* whitespace
2023-08-02 08:58:52 -07:00
Alex Telon
b66361843a Timing and Context can now be used as decorators (#1385)
* Context and Timing can now be used as decorators

* Using Timing decorator in quickstart.md

The time formatting is better and is a useful tool to learn.

Old: Time: 3.5260659999912605
New: Time: 3526.14 ms

* Updated env_vars documentation for Context

* Added test for Context decorator

* Put new import on same line as others
2023-08-01 17:16:10 -07:00
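
The commit above describes a timer that works both as a context manager and as a decorator, and that reports milliseconds instead of raw seconds. A minimal sketch of that pattern follows. It is not the code from the commit: the `Timing` class here is a hypothetical stand-in built on the standard library's `contextlib.ContextDecorator`, and the `prefix` parameter, the `elapsed_ms` attribute, and the `format_ms` helper are assumptions made for illustration.

```python
import contextlib
import time


def format_ms(seconds):
    """Format a duration given in seconds as milliseconds with two decimals."""
    return f"{seconds * 1e3:.2f} ms"


class Timing(contextlib.ContextDecorator):
    """Hypothetical timer usable as `with Timing():` or as `@Timing()`."""

    def __init__(self, prefix="Time: "):
        self.prefix = prefix
        self.elapsed_ms = None  # filled in when the timed block exits

    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, *exc):
        elapsed = time.perf_counter() - self.start
        self.elapsed_ms = elapsed * 1e3
        print(f"{self.prefix}{format_ms(elapsed)}")
        return False  # never swallow exceptions raised inside the block


# Context-manager form: times the indented block.
with Timing("block: "):
    total = sum(range(100_000))


# Decorator form: times every call to the wrapped function.
@Timing("call: ")
def work():
    return sum(range(100_000))


work()
```

Inheriting from `ContextDecorator` gives the decorator form for free, so the timing logic only has to be written once in `__enter__` and `__exit__`.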
Diogo
4dc8595069 simple exporting models (#1344)
* unified exporting

* json exporting

* ignore more

* simplified buffer export

* added dtypes

* added assert

* swift example

* fix tests

* linter

* remove whitespace

* fixed tests

* remove swift example

* remove unintended changes

* allow callable models to be used

* whitespace

* more readable json export

* name change

* whitespace

* whitespace
2023-08-01 09:35:48 -07:00
Diogo
ba5e3818a0 Limit dims based on max size (#1390)
* working

* whitespace

* changed defaults to None

* linter

* last linter error
2023-07-31 19:18:19 -07:00
chenyu
b2fde9ec36 reshape to register variable value (#1386)
* reshape to register variable value

* better error message
2023-07-31 17:10:02 -07:00
Umut Zengin
0de5f20970 Re-open constant pad support to Tensor.pad (#1388)
* Added const padding support to .pad

* Linter
2023-07-31 17:08:57 -07:00
Alex Telon
2d10e0340e Refactored ContextVars (#1331) 2023-07-31 15:44:46 -04:00
chenyu
f5ef445cb6 trim space (#1381) 2023-07-31 10:37:57 -07:00
JaSpa99
5ab12059da rng hlops: add normal and kaiming_normal (#1378)
* add normal and kaiming_normal

* make sure its float

* add tests
2023-07-31 10:37:02 -07:00
George Hotz
37fa7e96fb Revert "update editorconfig, enforce via CI (#1343)" (#1380)
This reverts commit da2efecbe2.
2023-07-31 10:35:50 -07:00
Pavol Rusnak
da2efecbe2 update editorconfig, enforce via CI (#1343)
* update editorconfig to set unix-style newlines and trim whitespace

* add editorconfig github action to the CI

* fix whitespace
2023-07-30 18:44:30 -07:00
S-Lykles
c2b82ea8ac fix to_shape_strides (#1374)
* add tests for expr_node and expr_idxs

* simplify condition and add missing optimization
2023-07-30 18:42:46 -07:00
chenyu
1fdf560fb1 simplify get_contraction (#1373) 2023-07-30 18:35:22 -07:00
S-Lykles
a32c677601 Fix off by one error in View.expr_node (#1363)
* Fix off_by_one error in View.expr_node

* Add test for expr_node

* Remove whitespace before :

* test no arguments and properly test idx=None
2023-07-29 08:10:37 -07:00
Karan Handa
e0a69bdbe6 Fix argfix and add tests (#1365)
* Remove unreachable code

* Fixed argfix

* Add empty check and tests

* Removed redundant tests
2023-07-28 09:09:49 -07:00
wozeparrot
32d1afa4b5 feat: correct case when base is 0 (#1360) 2023-07-27 13:53:38 -04:00
wozeparrot
c22e77abfd Match torch on fractional negative base pow (#1352)
* feat: match torch on fractional negative base pow

* feat: tests for trunc
2023-07-26 19:14:54 -07:00
Umut Zengin
d4ebadf2da Small Tensor.cat optimization and reformatting (#1347) 2023-07-26 18:01:12 -04:00
geohotstan
4056f97187 Gather (#1329) 2023-07-25 15:05:41 -04:00
Francis Lam
9d142430cb Add option in llama.py to quantize weights to int8 at runtime (#1289)
* Add option in llama.py to quantize weights to int8 at runtime

Also added lm-eval to external

* Add support for llama-2 evaluation
2023-07-24 17:22:38 -07:00
Pavol Rusnak
cd60b8561c Add LLaMA-2 support (#1284)
Co-authored-by: wozeparrot <wozeparrot@gmail.com>
2023-07-24 17:12:02 -04:00
waifairer
d89fb729e5 flake8 (#1323)
* flake8: Ignore frequent violations, correct infrequent ones

* Ignore some rules in test

* Reorder test ignores

* Lint test + main

* EOF indent

* Include all E71,E72 errors

* Test the failing case in CI

* Revert "Test the failing case in CI"

This reverts commit 110add0a70.

* Push to test!
This reverts commit f317532779.

* ok back to passing
This reverts commit ba5052685f.

* Prove that CI fails when formatting is incorrect.

* Fix formatting

* Remove duplicate E117 rule

* Use flake8 config for precommit

---------

Co-authored-by: waifairer <waifairer@gmail.com>
2023-07-24 11:19:58 -04:00
George Hotz
086382b64e Revert "Fix max nan (#1298)" (#1334)
This reverts commit 50774470b2.
2023-07-23 20:41:28 -07:00
uncommonSensor
50774470b2 Fix max nan (#1298)
* Fix max nan

* Adds nan check option to max function
* Calls to max can pass in "ignore_nan=True" argument
* Added max nan CI tests

* Fix max nan

* Adds nan check option to max function
* Calls to max can pass in "ignore_nan=True" argument
* Added max nan CI tests
* Turned off due to the need for granularity
2023-07-23 19:39:44 -07:00
cheeetoo
a0965ee198 CI < 5 minutes (#1252)
* models matrix

* fix typo and install gpu deps

* install llvm deps if needed

* fix

* testops with cuda

* remove pip cache since not work

* cuda env

* install cuda deps

* maybe it will work now

* i can't read

* all tests in matrix

* trim down more

* opencl stuff in matrix

* opencl pip cache

* test split

* change cuda test exclusion

* test

* fix cuda maybe

* add models

* add more n=auto

* third thing

* fix bug

* cache pip more

* change name

* update tests

* try again cause why not

* balance

* try again...

* try apt cache for cuda

* try on gpu:

* try cuda again

* update packages step

* replace libz-dev with zlib1g-dev

* only cache cuda

* why error

* fix gpuocelot bug

* apt cache err

* apt cache too slow?

* opt and image in single runner

* add a couple n=autos

* remove test matrix

* try cuda apt cache again

* libz-dev -> zlib1g-dev

* remove -s since not supported by xdist

* the cache takes too long and doesn't work

* combine webgpu and metal tests

* combine imagenet to c and cpu tests

* torch tests with linters

* torch back by itself

* small windows clang test with torch tests

* fix a goofy windows bug

* im dumb

* bro

* clang with linters

* fix pylint error

* linter not work on windows

* try with clang again

* clang and imagenet?

* install deps

* fix

* fix quote

* clang by itself (windows too slow)

* env vars for imagenet

* cache pip for metal and webgpu tests

* try torch with metal and webgpu

* doesn't work, too long

* remove -v

* try -n=logical

* don't use logical

* revert accidental thing

* remove some prints unless CI

* fix print unless CI

* ignore speed tests for slow tests

* clang windows in matrix (ubuntu being tested in imagenet->c test)

* try manual pip cache

* fix windows pip cache path

* all manual pip cache

* fix pip cache dir for macos

* print_ci function in helpers

* CI as variable, no print_ci

* missed one

* cuda tests with docker image

* remove setup-python action for cuda

* python->python3?

* remove -s -v

* try fix pip cache

* maybe fix

* try to fix pip cache

* is this the path?

* maybe cache pip

* try again

* create wheels dir

* ?

* cuda pip deps in dockerfile

* disable pip cache for clang

* image from ghcr instead of docker hub

* why is clang like this

* fast deps

* try use different caches

* remove the fast thing

* try with lighter image

* remove setup python for cuda

* small docker and cuda fast deps

* ignore a few more tests

* cool docker thing (maybe)

* oops

* quotes

* fix docker command

* fix bug

* ignore train efficientnet test

* remove dockerfile (docker stuff takes too long)

* remove docker stuff and normal cuda

* oops

* ignore the tests for cuda

* does this work

* ignore test_train on slow backends

* add space

* llvm ignore same tests as cuda

* nvm

* ignore lr scheduler tests

* get some stats

* fix ignore bug

* remove extra '

* remove and

* ignore test for llvm

* change ignored tests and duration on all backends

* fix

* and -> or

* ignore some more cuda tests

* finally?

* does this fix it

* remove durations=0

* add some more tests to llvm

* make last pytest more readable

* fix

* don't train efficientnet on cpu

* try w/out pip cache

* pip cache seems to be generally better

* pytest file markers

* try apt fast for cuda

* use quick install for apt-fast

* apt-fast not worth

* apt-get to apt

* fix typo

* suppress warnings

* register markers

* disable debug on fuzz tests

* change marker names

* apt update and apt install in one command

* update marker names in test.yml

* webgpu pytest marker
2023-07-23 13:00:56 -07:00
George Hotz
47f9d82722 test_conv: relax to 0.93 2023-07-23 12:57:29 -07:00
chenyu
aa05495620 symbolic stride (#1326) 2023-07-23 12:41:22 -07:00
Cole Sutyak
2d4e182294 change fetch to allow for local file selection (#1309) 2023-07-23 15:00:16 -04:00
waifairer
7cac5ea16c [GH-1305] Refactor test_dtypes.py to be cleaner (#1306)
Co-authored-by: waifairer <waifairer@gmail.com>
2023-07-21 18:18:02 -04:00
Jacob Pradels
b112edd2c3 Add pylint trailing whitespace rule (#1314) 2023-07-21 13:37:55 -04:00
madt2709
d2c1e8409a Update arange to be (start, stop, step) (#1308) 2023-07-21 00:27:23 -04:00
George Hotz
f45013f0a3 stable diffusion: remove realizes we don't need 2023-07-20 19:53:07 -07:00
George Hotz
9dffc9ba23 Use nevergrad to optimize kernels (try 2) (#1301)
* nevergrad try 2

* touchups

* no ones

* opt fixup

* cleanups

* touchup

* make new optimizer file
2023-07-20 16:46:45 -07:00
George Hotz
50a399ffa3 real world test: relax memory 2023-07-20 14:06:22 -07:00
George Hotz
17830e25da real world tests (#1297)
* real world test

* touchup

* sync device
2023-07-20 10:50:22 -07:00
George Hotz
ca77d6cd72 bfloat16 in LLVM (enough for llama 2) (#1293)
* add bf16 support to LLVM

* bf16 read works
2023-07-19 20:18:32 -07:00
Umut Zengin
74e63fe4ee Added test_chunk and fixed (#1283) 2023-07-19 22:21:26 -04:00
George Hotz
f7b0320d8b add cifar training regression test (#1287)
* add cifar training regression test

* clean up print
2023-07-19 14:17:09 -07:00
George Hotz
45ecae1ab3 Revert "Match Torch speed for sum reduction on M1 (#1187)" (#1286)
This reverts commit 59af9b81c5.
2023-07-19 13:39:16 -07:00
chenyu
120ae74008 Enable JIT test for size 1 tensor (#1285) 2023-07-19 11:06:40 -07:00
chenyu
940b6fd21a Revert "Fix constant folding for Tensor([3]) (#1227)" (#1274)
This reverts commit ab645317c9.
2023-07-19 10:51:06 -07:00
chenyu
0aed3f73da More JIT test cases (#1280)
* More JIT test cases

* test against jit_cache directly

* remove unused
2023-07-19 10:45:43 -07:00
George Hotz
d6637623e3 torch test touchup 2023-07-19 09:37:23 -07:00
Alexander Edwards
59af9b81c5 Match Torch speed for sum reduction on M1 (#1187)
* Add additional kernel when reducing multiple dimensions at once.

* Faster for smaller inputs

* Whitespace and naming

* Cleaner, guard for Metal only, and max 1 split rather than N

* Draft of different approach

* One additional kernel call for this test (as expected)
2023-07-19 09:18:58 -07:00
Umut Zengin
fde9f0e60d Slice migrated in Eye op (#1281)
* Migrated from slice to pad and shrink, made cleaner

* Changed repeat with reshape and expand
2023-07-19 09:08:38 -07:00
chenyu
a5f5330d91 Add Fuzz Test symbolic / shapetracker to CI. (#1278)
* Fuzz test symbolic and shapetracker

This reverts commit d5773ddebff54c1ff608838076f0b4ff126b8aa8.

* mess again

* no tail

* test shapetracker too

* Revert mess and enable all tests

* removed leftover
2023-07-19 09:05:45 -07:00