Commit Graph

21 Commits

Author SHA1 Message Date
Diogo
d17ecccd78 Torch/LLVM/arm F64 support (#1551) 2023-08-15 21:21:08 -04:00
geohotstan
07b79f210f llvmir support for bool <-> float casting (#1492) 2023-08-09 13:12:52 -04:00
Diogo
d7d1011f1e Add WEBGPU tests to CI (#1463)
* webgpu tests

* assert device is webgpu

* missed env set

* exclude failing ci tests

* ignore test file

* changed acc for adam test
2023-08-06 10:32:01 -07:00
George Hotz
d67e248d9b simple bitcast 2 (#1445)
* simple bitcast 2

* bc 2

* empty

* Revert "empty"

This reverts commit d8ee083655.
2023-08-06 00:30:50 -07:00
cheeetoo
a0965ee198 CI < 5 minutes (#1252)
* models matrix

* fix typo and install gpu deps

* install llvm deps if needed

* fix

* testops with cuda

* remove pip cache since not work

* cuda env

* install cuda deps

* maybe it will work now

* i can't read

* all tests in matrix

* trim down more

* opencl stuff in matrix

* opencl pip cache

* test split

* change cuda test exclusion

* test

* fix cuda maybe

* add models

* add more n=auto

* third thing

* fix bug

* cache pip more

* change name

* update tests

* try again cause why not

* balance

* try again...

* try apt cache for cuda

* try on gpu:

* try cuda again

* update packages step

* replace libz-dev with zlib1g-dev

* only cache cuda

* why error

* fix gpuocelot bug

* apt cache err

* apt cache to slow?

* opt and image in single runner

* add a couple n=autos

* remove test matrix

* try cuda apt cache again

* libz-dev -> zlib1g-dev

* remove -s since not supported by xdist

* the cache takes too long and doesn't work

* combine webgpu and metal tests

* combine imagenet to c and cpu tests

* torch tests with linters

* torch back by itself

* small windows clang test with torch tests

* fix a goofy windows bug

* im dumb

* bro

* clang with linters

* fix pylint error

* linter not work on windows

* try with clang again

* clang and imagenet?

* install deps

* fix

* fix quote

* clang by itself (windows too slow)

* env vars for imagenet

* cache pip for metal and webgpu tests

* try torch with metal and webgpu

* doesn't work, too long

* remove -v

* try -n=logical

* don't use logical

* revert accidental thing

* remove some prints unless CI

* fix print unless CI

* ignore speed tests for slow tests

* clang windows in matrix (ubuntu being tested in imagenet->c test)

* try manual pip cache

* fix windows pip cache path

* all manual pip cache

* fix pip cache dir for macos

* print_ci function in helpers

* CI as variable, no print_ci

* missed one

* cuda tests with docker image

* remove setup-python action for cuda

* python->python3?

* remove -s -v

* try fix pip cache

* maybe fix

* try to fix pip cache

* is this the path?

* maybe cache pip

* try again

* create wheels dir

* ?

* cuda pip deps in dockerfile

* disable pip cache for clang

* image from ghcr instead of docker hub

* why is clang like this

* fast deps

* try use different caches

* remove the fast thing

* try with lighter image

* remove setup python for cuda

* small docker and cuda fast deps

* ignore a few more tests

* cool docker thing (maybe)

* oops

* quotes

* fix docker command

* fix bug

* ignore train efficientnet test

* remove dockerfile (docker stuff takes too long)

* remove docker stuff and normal cuda

* oops

* ignore the tests for cuda

* does this work

* ignore test_train on slow backends

* add space

* llvm ignore same tests as cuda

* nvm

* ignore lr scheduler tests

* get some stats

* fix ignore bug

* remove extra '

* remove and

* ignore test for llvm

* change ignored tests and durationon all backends

* fix

* and -> or

* ignore some more cuda tests

* finally?

* does this fix it

* remove durations=0

* add some more tests to llvm

* make last pytest more readable

* fix

* don't train efficientnet on cpu

* try w/out pip cache

* pip cache seems to be generally better

* pytest file markers

* try apt fast for cuda

* use quick install for apt-fast

* apt-fast not worth

* apt-get to apt

* fix typo

* suppress warnings

* register markers

* disable debug on fuzz tests

* change marker names

* apt update and apt install in one command

* update marker names in test.yml

* webgpu pytest marker
2023-07-23 13:00:56 -07:00
waifairer
7cac5ea16c [GH-1305] Refactor test_dtypes.py to be cleaner (#1306)
Co-authored-by: waifairer <waifairer@gmail.com>
2023-07-21 18:18:02 -04:00
George Hotz
ca77d6cd72 bfloat16 in LLVM (enough for llama 2) (#1293)
* add bf16 support to LLVM

* bf16 read works
2023-07-19 20:18:32 -07:00
Diogo
a9a1df785f Webgpu support (#1077)
* initial commit

* 81 passing

* 105 passing tests

* 148 passing

* CI tests

* install dep on ci

* try opencl pkgs

* try using vulkan

* down to only 6 failing

* refactor

* cleaning up

* another test skipped due to buffer limit

* linter

* segfault

* indent fix

* another segfault found

* small touchups

* Fix max and maxpool tests

* Add constant folding

* Add javascript export script

* better asserts in codegen

* manual upcasting

* reverted token type change

* skip safetensor test due to unsupported type

* FIx efficientnet and all other model tests

* Remove np copy

* fixed indent and missing import

* manually destroy the buffer

* revert back to length

* linter errors

* removed extra val

* skip broken tests

* skipping more tests

* Make the page pretty

* Save model weights as safetensor

* Fix imagenet to c test

* Fix second imagenet to c bug

* Async and paralel kernel compilation

* workgroup support

* reversed local size

* fixed non local bug

* correct local groups

* ci experiment

* removed typo

* Fix define local by using shared memory

* Refactor

* try running on mac

* match metal tests

* add more workers

* scope down tests

* trying windows runner

* fixed windows env

* see how many it can do

* merged master

* refactor

* missed refactor

* increase test suite coverage

* missing import

* whitespace in test_efficientnet.py

* getting there

* fixed reset

* fixed bufs

* switched to cstyle

* cleanup

* min/max rename

* one more linter issue

* fixed demo

* linter

* testing ci chrome

* add unsafe webgpu arg

* add build step

* remove WEBGPU from cmd line

* use module

* try forcing directx

* trying forced metal backend

* temp disable conv2d for CI

* disable conv_trasnpose2d

---------

Co-authored-by: 0x4d - Martin Loretz <20306567+martinloretzzz@users.noreply.github.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-07-12 12:52:06 -07:00
Yosef Frost
613bcd945d Added Test Coverage to Int32 and Make Sure Tests Succeed (#1174)
* Added test coverage for int32 in `test/test_dtype.py`

Tests for int32 include:
- testing that int32 can be converted into a numpy array
- testing that float and int64 can be cast into int32
- testing that int32 can be cast into float and int64
- testing addition, multiplication, and matrix multiplication with int32
- testing that addition, multiplication, and matrix multiplication with int32 and either float or int64 gets successfully cast into float and int64, respectively

Additional changes include testing that int8 casts into int32 and testing that float16 casts into int32

* Added type casting to the add, subtract, and divide binary operations

* Added automatic type casting when types differ to FusedOps.MULACC

I moved the match_types function back so that I could call it in einsum_mulacc where it would cast the types of the MULACC to be the same

* Added unit test for match_types and added type hints to the parameters

* Added tests for ops_cpu.match_types

* Changed ops_cpu.einsum logic to play nicely with PyTorch

Changed `tinygrad.runtime.ops_cpu.einsum_mulacc` logic to not perform type matching. Type matching was instead moved to the numpy_fxn_for_op dictionary in the ops_cpu file. Since ops_torch uses the same einsum_mulacc function, this should fix all the broken pytorch tests.

* empty commit to rerun ci

* reverting PR#1213 in attempt to fix broken test

* Removed all tests I added to see if they are causing CI issues

* Added back type matching tests

* removed type matching tests and added back int tests

* added back part of the type matching tests

* removed braking type matching tests

* empty commit for testing

* added test back but inside comment

* removed a test from the comment to see if it breaks CI

* removed another function

* more testing

* emptied test comment

* cleaned up comments

* Added optimize=True flag to einsum_mullac in cpu_ops.py

* Removed unnecessary imports from tests

* optimized match_types by removing unnecessary array copying
2023-07-12 10:29:15 -07:00
Reza Rezvan
535224ac20 Remove float64 (#1101)
* Refactor: Remove float64

* Refactor: Remove unused imports

* Refactor: Remove float64

* Refactor: Remove float64

* Refactor: Exclude float64 onnx backend

* Add: Skip jacobian and gradcheck tests;
2023-07-04 08:40:51 -07:00
cloud11665
2407690d82 add cuda on cpu tests (#1020) 2023-06-22 14:15:50 -07:00
George Hotz
039f0d372f delete ltypes (#984)
* delete ltypes

* only upcast float types

* test dtype on mac passes

* ugh, these upcasts
2023-06-15 16:24:45 -07:00
Diogo
0629791cbd F64 support (#976)
* initial commit

* added osx check for opencl

* added llvm f64 conversions

* typo in llvmir

* more tests and modified unsupported error

* fixed linting error

* added pragma fp64

* simplified exclusion for OSX

* fixed device check and also added it to cast func

* added ifdef check for fp16 in ops_gpu

* Revert "added ifdef check for fp16 in ops_gpu"

This reverts commit 92de754d48.

* f64 prekernel signature match f16

* moved condition to buffer init
2023-06-13 21:31:31 -07:00
Diogo
1272d8526a Llvm int support (#866)
* added int val support to llvm

* lint fix

* added types

* fix merge issues
2023-05-30 17:49:26 -07:00
Diogo
0dab8edc97 support Int64 type in cstyle gen (#860)
* added metal int64 and some simple tests

* removed bool return type def

* typo in test

* also missing in clang and gpu runtimes

* switched order for opencl

* increased atol and removed new line in kernel prefix
2023-05-30 16:04:46 -07:00
wozeparrot
2fd2fb6380 int8/uint8 support (#837)
* feat: int8 support

* feat: uint8 support

* feat: int8 tests

* fix: fix uint8 on clang

* feat: test casting between int8/uint8/float16/float32

* clean: way cleaner dtype tests

* feat: preprocess_imagenet using the correct dtype

* feat: add test for overflow between uint8 and int8
2023-05-28 23:15:06 -07:00
Jacky Lee
fafe8e9ce2 casting: support all backends and implement half (#726)
* casting: support all backends and implement half

* map torch types in ops_torch

* reuse type map for torch buffer

* inverse dict lookup
2023-03-24 09:58:03 -07:00
Jacky Lee
e009b6f341 Add tests for casting (#724)
* Add tests for casting

* Skip half_matmul_upcast when TORCH=1

* Fix promotion on torch

* Fix spacing
2023-03-23 08:02:52 -07:00
George Hotz
5495c7d64e linearizer! (#714)
* linearizer outputs something

* working ish

* cstyle codegen

* clang mostly works

* fix load valid

* fix numberless loop

* fancy gen

* working

* fix enet compiler

* cleanups

* float4 upcasting

* less lines

* supports_float4

* constant folding

* mulacc

* internet tests flaky in CI

* 90% image support

* fix image generic

* bugs exposed with shapetracker and single view

* new llvm

* use vload, remove OLD

* that's really poorly done

* ending up being more lines
2023-03-19 23:43:49 -07:00
George Hotz
dc9a6b4bb7 fix float16 in CLANG on linux 2023-03-11 21:51:22 -08:00
George Hotz
1826ff6b89 dtypes nice and clean (#673)
* add dtype class

* dtypes

* buffers are lazy

* dtype is tracked by lazybuffer and GenericShape

* fix types in llvm

* llvm store

* dtype tests

* fix tests maybe

* fix flop counter

* fix CI

* CI fix and check format

* fix dtype and dtype check

* fix custom test

* fix test graph
2023-03-10 16:56:07 -08:00