Commit Graph

173 Commits

Author SHA1 Message Date
Szymon Ożóg
e8a1bcdeb5 print used device 2023-08-22 13:16:10 +02:00
Szymon Ożóg
7839657b42 use cudacpu in tests 2023-08-22 13:06:02 +02:00
Szymon Ożóg
f6c99e84eb disable cuda backend in triton tests 2023-08-22 12:59:11 +02:00
Szymon Ożóg
5c6edfb064 use triton tests 2023-08-22 10:54:07 +02:00
Szymon Ożóg
fbb3793b7a Add triton to existing testing routine 2023-08-22 10:32:58 +02:00
Szymon Ożóg
39d238734b remove pytorch cpu extra index 2023-08-22 09:47:45 +02:00
Szymon Ożóg
3a87a32b23 ignore test example 2023-08-22 09:03:07 +02:00
Szymon Ożóg
4220908646 Old testing routine 2023-08-22 08:21:10 +02:00
Szymon Ożóg
bf092c55ac fix envs in testing 2023-08-22 08:10:14 +02:00
Szymon Ożóg
8fcc25d0aa Merge triton tests into global tests 2023-08-21 18:08:01 +02:00
Szymon Ożóg
2bdd60565c Merge remote-tracking branch 'upstream/master' into triton 2023-08-21 17:53:17 +02:00
Roelof van Dijk
1900acda09 [READY] ci: setup venv cache (#1475)
* ci: cache installed packages

* ci: trigger jobs

* ci: fix hashfiles argument

---------

Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-20 18:43:16 -07:00
George Hotz
012ee7d162 not worth the speed (#1584)
* not worth the speed

* no slots

* uops comments

* bump to python 3.11 for speed

* add critical slots back
2023-08-20 10:24:58 -07:00
Szymon Ożóg
59dc1ad772 Enable test_nn 2023-08-20 10:31:50 +02:00
George Hotz
ad7d26c393 fix __launch_bounds__ and benchmark TC MATMUL (#1575)
* fix

* benchmark matmul
2023-08-19 10:54:39 -07:00
chenyu
ae39cf84ab Symbolic Shape JIT main PR (#1353)
* Symbolic Shape JIT

update tests

2 variables symbolic ops, adding more tests

test passing

cleanup

* more test cases

* single flag

* review update

* jit attention one piece

* realize

* symbolic_jit test for cuda

* old artifact

* works with cuda gpu but failed ci

* CUDACPU
2023-08-18 14:39:55 -07:00
Roelof van Dijk
84e6693915 fix: apt-get to apt, no recommends, clean up (#1571)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-18 13:48:59 -07:00
Szymon Ożóg
5430fcb4e9 cuda envs 2023-08-18 17:49:48 +02:00
Szymon Ożóg
2cfc7121b1 remove emulated from triton tests 2023-08-18 17:40:17 +02:00
Szymon Ożóg
4af5a60caf merge double package install 2023-08-18 17:38:22 +02:00
Szymon Ożóg
4362ebb547 install cuda packages for testing 2023-08-18 17:05:45 +02:00
Szymon Ożóg
48c7c9161e Merge test.yml 2023-08-18 16:51:16 +02:00
Ethan Sorrell
cb62911f6b PTX Reintegration and Passing Tests (#1512)
* move assembly, assembly_ptx

* successful but broken rendering of ptx asm

* clear ins before render asm

* slightly less broken :')

* we needed thread syncs

* fix float16 loading, rounding modifiers and other casting stuff, passing casts_from_half

* Fix runtime_args for gpuocelot

* our casts were flipped on both ends

* more casting

* add ternary where op

* dealing with storing/loading bool

* add test for casting to bool from negative

* Fix args.valid on ConstOp

* add to CI, TODO: fix runtime_args for test_uops

* fix placement of runtime_args to work with lazy.Device

* undo ci changes so I can push

* fix lints

* start cleanup and fix things we broke fixing lints

* add checks for PTX specifc asm instructions

* revert added test -- doesn't pass on llvm

* skip tests for underflow,overflow

* another fix for how we're setting runtime args

* Less broken cleanup

* add to CI

* add more env variables for ci test

* fix ci to install pycuda for ptx

* ci: copy cuda test command

* cleanup

* assert to make sure we're actually running ptx in ci

* remove test assert

* move is_ptx arg

* move assembly, assembly_ptx back to extras

* fix imports

* initial merge fixes

* clear registers, fix UOps.LOAD with invalid value

* draft merge fixes

* remove prints

* quick lint and merge fixes

* cleanup

* remove PTXProgram wrapper

* final cleanup

* temp change for ci rerun

* ci rerun

* rollback ISA version
2023-08-16 16:20:20 -07:00
chenyu
11dd9b1741 symbolic codegen and exec (#1552)
* symbolic codegen and exec

* fix and add test

* no sketchy

* merge_dicts type

* dtypes._arg_int32
2023-08-16 14:43:41 -07:00
wozeparrot
074c467020 hotfix for broken ci (#1559) 2023-08-16 13:52:03 -04:00
Szymon Ożóg
7616b27a5b Added triton tests 2023-08-16 06:48:42 +02:00
Steven Anderson
93a36c3659 Arm (#1421)
* testing new memops

* better debugging

* testing padded conv

* branching with load

* refactoring a bit

* first try

* fixing bugs

* fixing some

* eq

* eq2

* do not use x's

* working

* fixing imm

* getting things working

* refactor

* pow not working

* working except one

* refactor: one store mem

* refactor: global load

* refactor: imm

* refactor: cleaning

* fixing big offsets

* refactor with ci

* try ci

* typo

* another typo

* ubuntu default

* forgot git

* do i need git?

* missing packages

* adding python-dev

* with cache?

* buildx action

* buildx name issue?

* maybe now?

* python3

* newline warning

* maybe now

* i actually need this

* ci should work now

* improved caching

* fixing cache

* maybe now it will cache

* this

* testing cache

* trying again

* load

* missing platform

* caching gha

* testing cache

* full testing

* typo

* now?

* why

* adding checkout back

* bad formatting

* fixing convention issues

* supporting python

* adding CI flag

* testing all

* better comments

* adding debugging

* takes 12x longer

* does it output progress now?

* ignore models for speed

* fixing merge

* excluding conv_transpose2d

* only 2 test cuz is to slow

* another approach

* let's see

* faster duh

* my bad

* T_T

* typo

* sup

* with output?

* comment test

* comment test

* comment test

* :?

* no comment

* with cache

* back to normal

* testing that ci works

* back to passing

* trying again

* does it create another entry

* does it create another entry?

* build local

* hey

* Revert "excluding conv_transpose2d"

This reverts commit cc7348de03.

* does it cache if done before?

* does it cache?

* done

* adding test ops

* bad formatting

* no need for this

* working static mem

* sum 1d

* add ndim

* better reg import

* fix stack

* back to np

* working except for softmax

* 5 failing

* no pogress

* remove keystone

* remove keystone

* testops passing

* cleanups

* more cleanup

* typo

* ci

* ci2

* cond import

* ci3

* ci4

* ci4

* ci5

* ci5

* ci6

* aligment

* test all

* correct test

* err read_unmapped

* passing test

* ignore for speed

* ignore for speed

* ci7

* cleanup

* remove docker

* fixing merge

* fixing bugs

* add skipload for const ops

* comments

* First merge to master: Renderer

* fix emulation

* passing all tests arm64

* cleaning

* fix handcoded binary

* cleaning

* fix errs

* fix runtime arg binary

* clean git diff

* fix and clean

* fixing metal test

* cleaning

* fix metal test

* ci ~8 min

* fix pylint and clang

* cache the files in ops_clang

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-08-14 19:29:30 -07:00
wozeparrot
29d5801387 distributed collectives (#1519)
* feat: world

* feat: tests

* feat: no more backwards

* feat: recv into

* feat: whoops

* feat: test in ci

* feat: some debug logging

* feat: workflow naming

* feat: need to set pythonpath

* feat: just send to same device

* feat: allreduce

* feat: test

* feat: need contiguous

* feat: test in ci

* feat: exit with correct code

* feat: don't need that

* feat: opencl wait_for just doesn't work

* feat: synchronize on out

* feat: try?

* feat: try again?

* feat: add extra realizes

* feat: print

* feat: seed

* feat: tol

* feat: test ones and zeros

* feat: remove print

* feat: are you just flaky

* feat: seperate scatter and gather?

* feat: just try synchronizing

* feat: remove print again

* feat: bring back difference

* feat: no sync

* feat: revert that

* feat: back to wait_for

* fix: typo
2023-08-11 10:22:07 -07:00
wozeparrot
7e7c9001e9 distributed world (#1481)
* feat: world

* feat: tests

* feat: no more backwards

* feat: recv into

* feat: whoops

* feat: test in ci

* feat: some debug logging

* feat: workflow naming

* feat: need to set pythonpath

* feat: just send to same device
2023-08-10 10:00:51 -07:00
George Hotz
e3c6c0c6db add GPT2 example (#1511) (#1514)
* add gpt2 to examples

* some cleanup

* fixes

* argparse + scaled_dot_product_attention

* add timing

* add to benchmark

Co-authored-by: YassineYousfi <yassine.y10@gmail.com>
2023-08-10 09:09:47 -07:00
wozeparrot
351684395c dont run on fork (#1510) 2023-08-09 13:06:45 -04:00
wozeparrot
88e2e0c8a3 Revert "don't try to run benchmark on forks" (#1508) 2023-08-09 12:59:49 -04:00
wozeparrot
65b65b760b don't try to run benchmark on forks (#1507) 2023-08-09 12:59:19 -04:00
Roelof van Dijk
aa83a9e910 ci: fix gpuocelot build cache (#1474)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-08 14:00:04 -07:00
Roelof van Dijk
e2cf0f322e [READY] ci: missing n=auto (#1486)
* ci: missing n=auto

* fix: add to commented test

---------

Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-08 07:37:24 -07:00
George Hotz
5fdd248617 don't download cifar (#1472) 2023-08-06 21:38:59 -07:00
George Hotz
d78fb8f4ed add stable diffusion and llama (#1471)
* add stable diffusion and llama

* pretty in CI

* was CI not true

* that

* CI=true, wtf

* pythonpath

* debug=1

* oops, wrong place

* uops test broken for wgpu

* wgpu tests flaky
2023-08-06 21:31:51 -07:00
Diogo
d7d1011f1e Add WEBGPU tests to CI (#1463)
* webgpu tests

* assert device is webgpu

* missed env set

* exclude failing ci tests

* ignore test file

* changed acc for adam test
2023-08-06 10:32:01 -07:00
George Hotz
486a9dbfd9 speed v torch (#1464)
* speed v torch

* always print

* change print

* torch speed tee

* all exposed
2023-08-06 09:32:33 -07:00
George Hotz
2ab282bfec run on update_benchmark too (#1460)
* run on update_benchmark too

* amd inference test

* name it better

* add 10 CIFAR training steps
2023-08-06 08:58:37 -07:00
George Hotz
943b227cb1 only on push to master 2023-08-06 00:10:07 -07:00
George Hotz
2274e3e757 Fix benchmark (#1454)
* do benchmarking

* system

* artifact

* go

* name artifact

* only on push
2023-08-05 23:44:36 -07:00
George Hotz
bf21aec81f do benchmarking (#1451)
* do benchmarking

* system

* artifact

* go

* name artifact
2023-08-05 23:35:01 -07:00
George Hotz
67781fcf5d fix fail fast in CI 2023-08-05 10:24:24 -07:00
wozeparrot
ab9e4a2e93 Make cuda CI a bit more consistent (#1403)
* feat: use fast-apt-mirror

* feat: use in more places
2023-08-02 07:38:22 -07:00
Diogo
4dc8595069 simple exporting models (#1344)
* unified exporting

* json exporting

* ignore more

* simplified buffer export

* added dtypes

* added assert

* swift example

* fix tests

* linter

* remove whitespace

* fixed tests

* remove swift example

* remove unintended changes

* allow callable models to be used

* whitespace

* more readable json export

* name change

* whitespace

* whitespace
2023-08-01 09:35:48 -07:00
George Hotz
f27df835a6 delete dead stuff (#1382)
* delete bpe from repo

* remove yolo examples

* Revert "remove yolo examples"

This reverts commit cd1f49d466.

* no windows
2023-07-31 11:17:49 -07:00
George Hotz
37fa7e96fb Revert "update editorconfig, enforce via CI (#1343)" (#1380)
This reverts commit da2efecbe2.
2023-07-31 10:35:50 -07:00
Pavol Rusnak
da2efecbe2 update editorconfig, enforce via CI (#1343)
* update editorconfig to set unix-style newlines and trim whitespace

* add editorconfig github action to the CI

* fix whitespace
2023-07-30 18:44:30 -07:00
chenyu
ab80ea0d38 use ubuntu for clang ci test (#1368) 2023-07-28 20:51:25 -04:00