6529 Commits

Author SHA1 Message Date
chenyu
c398f2467c test uop mul min/max do not have nan in 0*inf (#7340) 2024-10-28 17:52:01 -04:00
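Note on the 0*inf case named above: under IEEE 754, multiplying zero by infinity yields NaN, and NaN makes ordinary min/max comparisons unreliable. The snippet below is a plain-Python illustration of that arithmetic only. It does not use tinygrad and makes no claim about how the test in this commit is written.

```python
import math

# IEEE 754: zero times infinity is NaN, not zero and not infinity.
product = 0.0 * math.inf
print(math.isnan(product))

# Every comparison against NaN is False, so Python's built-in min()
# returns a different answer depending on argument order.
print(min(product, 1.0))
print(min(1.0, product))
```

Because a NaN bound can silently win or lose a comparison, a bound computed from a 0*inf product cannot be trusted.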
chenyu
0843734927 clean up nan handling in transcendental (#7332)
* clean up nan handling in transcendental

* skip remu crash
2024-10-28 16:21:49 -04:00
Sieds Lykles
75dcd98e79 Fix calculation of vmin and vmax in multiplication when one src is negative and the other src has negative min and positive max (#7333)
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-10-28 16:01:46 -04:00
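Note on the case fixed above: for integer ranges, the bounds of a product can be taken from the four corner products, which covers the situation where one operand is negative and the other spans zero. This is a minimal sketch of the general interval rule, assuming finite integer bounds. The helper name `mul_bounds` is hypothetical and this is not the code changed in #7333.

```python
def mul_bounds(a_min, a_max, b_min, b_max):
    """Bounds of x*y for x in [a_min, a_max] and y in [b_min, b_max].

    The extremes of a product over two intervals always occur at the
    corners, so it is enough to evaluate all four corner products.
    """
    corners = (a_min * b_min, a_min * b_max, a_max * b_min, a_max * b_max)
    return min(corners), max(corners)

# One source is strictly negative, the other has a negative min and a positive max.
print(mul_bounds(-3, -1, -2, 5))  # (-15, 6)
```

Taking only `a_min * b_min` and `a_max * b_max` here would give (6, -5), an inverted range, which is why all four corners are checked.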
chenyu
603fcc96f2 limit UOps.ALU min/max to non-float only (#7336)
does this impact anything? some inf is incorrect now
2024-10-28 15:34:19 -04:00
ignaciosica
32fa297e6c cleaner nan rendering (#7337) 2024-10-28 14:36:36 -04:00
qazal
00362a117c scheduler bfs renames [pr] (#7335) 2024-10-29 00:24:23 +08:00
qazal
d8820644e0 split preschedule from ast rewrite [pr] (#7334) 2024-10-28 17:45:09 +02:00
chenyu
6b0e8cb04f remove float_to_bits in transcendental [pr] (#7331)
it's just bitcast, and removed the weird bits_to_float indirection
2024-10-28 10:20:19 -04:00
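Note on "it's just bitcast": converting a float to its bit pattern reinterprets the same bytes as an integer and does not change any bits. The sketch below shows that idea for float32 using Python's standard `struct` module. The function names echo the ones in the commit message, but the bodies are illustrative assumptions and not tinygrad code.

```python
import struct

def float_to_bits(x: float) -> int:
    # Pack as IEEE 754 float32, then reinterpret the same 4 bytes as uint32.
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_float(b: int) -> float:
    # Inverse direction: reinterpret a uint32 bit pattern as float32.
    return struct.unpack("<f", struct.pack("<I", b))[0]

print(hex(float_to_bits(1.0)))    # 0x3f800000
print(bits_to_float(0x3F800000))  # 1.0
```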
qazal
b9b28e6883 viz stuff [pr] (#7330)
* viz stuff [pr]

* button
2024-10-28 21:46:18 +08:00
Bhavya Gada
9b7e76e508 VIZ UI improvement: resizable and collapsible sidebars (#7317)
* make left sidebar resizable

* add sidebar collapse/expand button

* refactor to reduce loc and make resize work correctly

* combine both resizers
2024-10-28 21:19:43 +08:00
qazal
e46edc22aa use unittest helpers in TestTensorMetadata [pr] (#7329)
* use unittest helpers in TestTensorMetadata [pr]

* fix that

* 5 args
2024-10-28 18:38:30 +08:00
chenyu
96fcc47e27 touchup abstraction docs (#7327)
fix typing and use tinygrad tqdm
2024-10-27 22:29:55 -04:00
chenyu
cb5702f170 tiny cleanup to transcendental xexp2 (#7326)
also added test for exp and log of nan and inf
2024-10-27 21:54:20 -04:00
chenyu
4c855ae692 unit test transcendental helpers (#7325)
added a test to run UOps with const inputs. seems to have issue with both payne_hanek_reduction and cody_waite_reduction
2024-10-27 19:55:00 -04:00
qazal
8d9459f281 always run process replay with contextvars (#7323)
* always run process replay with contextvars [pr]

* not the last two

* extra

* no pr
2024-10-27 20:44:42 +02:00
qazal
adcdaa17bb map BUFFER to Metadata [pr] (#7324) 2024-10-27 20:31:04 +02:00
qazal
d634261c51 late buffer uops [pr] (#7322) 2024-10-27 19:34:01 +02:00
chenyu
cdbe08b94b use UOp.render in colored_shape (#7321)
similar to function name, print rendered str instead of raw UOp
2024-10-27 11:42:31 -04:00
chenyu
4a03e00aa1 fix llama3 download_model assert (#7320)
false positive if download_model and model are not provided
2024-10-27 11:20:24 -04:00
talati
d4d201d87b fixing branch condition on UOps.IF in the ptx renderer (#7315)
* fixing branch condition on UOps.IF in the ptx renderer

* ptx works

---------

Co-authored-by: Nick Talati <nick.talati@quantworks.com>
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
Co-authored-by: qazal <qazal.software@gmail.com>
2024-10-27 14:27:38 +02:00
qazal
a410b46c1d unskip test_gated_store_with_if [pr] (#7319) 2024-10-27 14:03:12 +02:00
Maximilian Wolf
3c992250d5 Failing test: different behavior on different devices (#7193)
* add minimal failing test

* more tiny makes linter happy

* tinyfy

* no walrus in assert

* a tiny bit simpler

* minimal

* better place, better name, expected failure

* skip devices with correct behavior
2024-10-27 09:53:58 +08:00
eliotgolding
e920f1d663 Llama 3.2 1B load from GGUF (#7295)
* gguf 1b-instruct

* not needed
2024-10-27 09:29:02 +08:00
chenyu
d66fe7a66f fix simplify_valid (#7313)
the simplex should compare with valid bound, not its vmin
2024-10-26 14:21:12 -04:00
chenyu
0a4d01f6d4 disable simplify_valid (#7312)
fixed test_failure_55. will reenable it later after fixing the bug
2024-10-26 12:42:48 -04:00
nimlgen
293714610a capture beam log runtime errors (#7311) 2024-10-26 13:59:45 +03:00
nimlgen
3c62315aa8 add resnet pf (#7310)
* add resnet pf

* all platforms
2024-10-26 13:20:32 +03:00
nimlgen
68cd2c0669 nv correct local memory based on device (#7307)
* nv correct local memory based on device

* linter

* oops

* oops2
2024-10-25 22:23:42 +03:00
chenyu
2ddfb9678a update exponent_bias in transcendental (#7304)
from https://en.wikipedia.org/wiki/Exponent_bias, 15, 127, 1023 are bias
2024-10-25 10:45:49 -04:00
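Note on the bias values above: for an IEEE 754 binary format with k exponent bits, the bias is 2^(k-1) - 1, which gives 15, 127 and 1023 for half, single and double precision. The helper below is a hypothetical illustration of that formula, not the `exponent_bias` function touched by this commit.

```python
def ieee_exponent_bias(exponent_bits: int) -> int:
    # IEEE 754 binary formats use a bias of 2**(k-1) - 1 for k exponent bits.
    return (1 << (exponent_bits - 1)) - 1

# Exponent field widths: float16 has 5 bits, float32 has 8, float64 has 11.
for name, bits in (("float16", 5), ("float32", 8), ("float64", 11)):
    print(name, ieee_exponent_bias(bits))
```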
chenyu
e7cd21c5e3 remove custom render in test_simplify_valid_idx (#7303)
use UOp render to compare
2024-10-25 10:20:26 -04:00
chenyu
4688c01e3e transcendental cleanups (#7301)
simplified polyN and some redundant line cleanups
2024-10-25 09:30:25 -04:00
George Hotz
aadf688aeb order flipper as *normal* rewrite rule (#7300)
* instant isn't actually used [pr]

* order flipper as *normal* rewrite rule

* fix inf loop

* need simplify now
2024-10-25 21:28:30 +08:00
George Hotz
3c31497f55 instant isn't actually used [pr] (#7299)
* instant isn't actually used [pr]

* tolerance bump
2024-10-25 21:01:29 +08:00
George Hotz
199a991237 line reduction [pr] (#7296) 2024-10-25 17:05:09 +07:00
George Hotz
4fed358511 hotfix: timeouts to 20 minutes. better no stats update than a red x 2024-10-25 16:31:52 +08:00
George Hotz
dc3148c677 hotfix: minor speed increase + stable diffusion relax 2024-10-25 16:27:21 +08:00
George Hotz
4812801aa6 try for canonical order (#7286)
* try for canonical order

* cmp better

* disable bad tests

* flip const order

* fix test

* fix tests

* different fix for NOOP

* metaclass here

* fix tests

* narrower scope
2024-10-25 16:04:54 +08:00
George Hotz
d3500af71b move consts last in uop toposort (#7290)
* move consts last in uop toposort

* consts first in toposort
2024-10-25 14:58:48 +08:00
qazal
e3c9c94896 Revert "move anything that isn't bfs [pr] (#7273)" (#7289)
This reverts commit b805711f86.
2024-10-25 14:38:30 +08:00
qazal
0b47eca085 schedule.py reorders [pr] (#7285)
* schedule.py reorders [pr]

* diff

* more renames
2024-10-25 14:30:23 +08:00
George Hotz
004af512e6 try all matches in the function (#7288) 2024-10-25 14:17:04 +08:00
George Hotz
bcf0537653 canonicalize the order prereqs (#7283)
* canonicalize the order

* don't change that yet

* that order isn't safe with uops
2024-10-25 11:37:51 +08:00
qazal
603d637105 split to fuse.py and schedule.py [pr] (#7284) 2024-10-25 06:17:24 +03:00
qazal
698457c5ce big graph ScheduleContext [pr] (#7282) 2024-10-25 05:58:23 +03:00
qazal
b805711f86 move anything that isn't bfs [pr] (#7273) 2024-10-25 05:34:21 +03:00
George Hotz
6dc7d3c949 instant uop rules [pr] (#7263)
* instant uop rules [pr]

* real instant

* only instant folder

* better diff

* instant means instant

* Revert "instant means instant"

This reverts commit e58d9161bf.
2024-10-25 10:32:45 +08:00
chenyu
90f720d703 limit idiv by neg bound to only if s0 is non-negative [pr] (#7277)
also updated the tests when div by negative const
2024-10-24 15:46:50 -04:00
chenyu
d4c94d0d32 disable llama 1 4gpu and 6gpu benchmark (#7276)
having llama3 4gpu and 6gpu should be good enough
2024-10-24 14:19:22 -04:00
chenyu
e6929f2402 RUN_PROCESS_REPLAY=0 on llama 70B and resnet training (#7272)
* RUN_PROCESS_REPLAY=0 on llama 70B and resnet training

also added a 15 minutes total timeout, this cannot grow indefinitely

* add a few more

* a few more just for NV
2024-10-24 12:09:54 -04:00
chenyu
b777cfdcba update test_max_simplify_and_cancel (#7270)
it's fixed and no longer dumb
2024-10-24 10:29:05 -04:00