57 Commits

Author SHA1 Message Date
chenyu
41e45c20ff minor stuff reading the printed code [pr] (#13177) 2025-11-09 00:58:51 -05:00
chenyu
834067d91f move onnx import in compile3 (#13172)
only used in test_vs_onnx
2025-11-08 09:44:34 -08:00
Harald Schäfer
587ccc0e5c compile3: make selftests opt-in (#12851) 2025-10-21 11:32:27 -07:00
wozeparrot
990e8b97ee feat: log openpilot 0.10.1 times (#12816) 2025-10-20 18:30:34 -07:00
Harald Schäfer
addc54b96c Simplify openpilot compile3.py (#12748)
* Simpler compile3

* tests

* remove default args

* onnx file is still fp16

* self-test FP16 too

* allow test disable

* absurd tolerance

* Just do latest

* Try simplest

* use later models

* kernel count not relevant if speed is good

* dead improts

* Revert "dead improts"

This reverts commit f68c2cd15d.

* Revert "kernel count not relevant if speed is good"

This reverts commit 0955ca4ee0.

* add back kernal count check on latest model
2025-10-18 10:12:22 -04:00
George Hotz
612e3d6143 replace mop arg with vectorized index (#12695)
* replace mop arg with vectorized index

* tests passing

* better viz

* no compile4
2025-10-15 20:50:06 +08:00
nimlgen
658c566e22 vars in gated_read_image_count (#12486)
* vars in gated_read_image_count

* nc
2025-10-09 14:54:15 +08:00
chenyu
be05028419 move ASSERT_MIN_STEP_TIME to compile3 (#12535)
threshold is current time +20%
2025-10-08 22:16:59 -04:00
qazal
7e0b14243e delete grouper and kernelize (#12517)
* delete grouper and kernelize

* +sys.setrecursionlimit
2025-10-08 12:27:26 +03:00
George Hotz
0f25b4b289 move frontend dir to nn [pr] (#12470) 2025-10-07 10:42:22 +08:00
qazal
1af05dae77 fix rangeify in compile4.py (#12467)
* fix rangeify in compile4.py

* fix type_verify
2025-10-06 13:37:46 +03:00
chenyu
0e266f376c ops_gpu -> ops_cl (#12103) 2025-09-10 15:15:48 -04:00
George Hotz
842184a1ab rename kernelize to schedule, try 2 (#11305) 2025-07-21 11:18:36 -07:00
chenyu
85ddd72038 simpler grouptop in hcopt (#11219)
* simpler grouptop in hcopt

keep the only perf relevant conditions and the rest is handled by try except

* update openpilot read image count
2025-07-13 16:06:09 -04:00
geohotstan
5ce278b245 OnnxRunner file as input (#10789)
* file path as input and have parse be in OnnxRunner.__init__

* modelproto_to_onnxrunner -> modelproto_to_runner

* whoops, fix import

* oh flakiness again, is it because it's getting gc-ed?

* small changes

* CI flaky so just move compile4 fix in

* copy typing of onnx_load

* actually can just import onnx_load instead of onnx.load

* fix external_benchmark_openpilot

* fix onnx_runner test to use onnx_helper

* rerun CI

* try run_modelproto

* spam CI a few times

* revert run_modelproto since that's flaky also

* no external onnx_load usage except onnx.py

* cursor tab complete is evil. Snuck a darn sorted in. But does order change result? Why?

* model_benchmark 193s -> 80s, add OnnxRunner.to()...

* minimize diff and clean up

* device can be None, weird but eh

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-07-12 14:27:46 -04:00
geohotstan
50936b4a18 ONNX real float16 (#10694)
* squash commits

* temp fix for const tensor

* actually realizing float16 can only happen in raw_data

* .float -> cast(float) to rerun CI

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-06-26 14:05:12 -04:00
George Hotz
b41e0563a3 move stuff to kernelize folder (#10902)
* move stuff to kernelize folder

* oops, forgot that
2025-06-20 16:10:20 -07:00
George Hotz
cba6e15937 split grouper and kernelize [pr] (#10854) 2025-06-17 17:54:20 -07:00
chenyu
7d5c769c6b fix compile4 (#10797) 2025-06-12 22:28:56 -04:00
b1tg
24d328e313 onnx parser (#10435)
* onnx parser

* fix compile, lint

* onnx.load -> onnx_load

* compatible with ModelProto

* fix test external_test_onnx_ops.py

* fix tests

* fix signed int

* reduce to 261 lines

* fix TypeProto.Optional

* debug for _parse_message, add TypeProto.Sequence, cleanup

* onnx_load from Tensor

* remove BufferedReader

* 174 lines and reduce tensor copy

* cleanup

* use onnx_load in external_model_benchmark.py

* fix qcom test

* [onnx] parser support external data

---------

Co-authored-by: b1tg <b1tg@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-06-09 12:44:28 -04:00
George Hotz
32e9949052 rename lazydata to uop (#10698) 2025-06-08 08:42:22 -07:00
George Hotz
0d39bb5de1 rename to get_kernelize_map (#10465) 2025-05-22 11:44:44 -07:00
George Hotz
577a0b4cfa openpilot compile4 (wip) (#10407)
* openpilot compile4

* add copies

* remove junk
2025-05-22 10:47:34 -07:00
George Hotz
74d98eafb8 add onnx frontend stub [pr] (#9558) 2025-03-24 12:24:34 +08:00
ZwX1616
c977781b3c no numpy change if no NPY (#9281)
* skip np change check if no NPY

* use any
2025-02-28 09:32:35 +08:00
George Hotz
8b16c65bca add compile3 benchmark [pr] (#8929) 2025-02-06 22:49:31 +08:00
geohotstan
dd82b4c913 make onnx runner a class (#8647)
* this

* clean up

* more clean ups and improve debug msg

* more correct training toggler

* remove manual training toggling

* change some variable names

* actually just add the training toggle for LIMIT envvar too

* more refinement

* __call__ and OnnxRunner

* fix half pylint, other half is importing from onnx while this file is onnx.py, figure out later

* ahhhh found another mistake

* remove limit from __call__

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-01-20 10:11:05 -08:00
Harald Schäfer
7059459648 Openpilot compile: fix for openpilot use (#8338)
* compile3 changes

* merge conflict

* merge conflict

* give dm npy for now

* Revert "give dm npy for now"

This reverts commit bfd980da7d2c2bab5b073127442c361922032ba1.

* updates

* Always float32 floats

* Update compile3.py

* Update compile3.py

---------

Co-authored-by: ZwX1616 <zwx1616@gmail.com>
2024-12-19 19:43:15 -05:00
chenyu
26e049ab40 add ALLOWED_READ_IMAGE=2131 to openpilot (#8166)
added as exact number check now as it's not clear if more/less than allowed is any better
2024-12-11 12:14:17 -08:00
George Hotz
f83d715f41 move checks into compile3, delete compile2 [pr] (#8127)
* move checks into compile3 [pr]

* test_vs_onnx

* test v torch works

* float16 won't compile on compile3

* actually delete compile2
2024-12-09 14:21:42 -08:00
George Hotz
00ac0db9d4 np tensors have the memory from numpy in compile3 [pr] (#8098) 2024-12-07 14:01:51 +08:00
George Hotz
22feb3a2f1 move copy into the JIT for openpilot compile3 (#7937)
* move copy into the JIT, test fails

* ahh, prune was the issue
2024-12-07 13:26:26 +08:00
George Hotz
fbb4099b3c add test for compile3 [pr] (#7783)
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2024-11-19 19:26:51 +08:00
Harald Schäfer
e7cbc29f48 openpilot benchmark: add cast from numpy to benchmark (#7593)
* openpilot benchmark: add cast from numpy to benchmark

* whitespace

* comment
2024-11-08 19:31:00 +08:00
George Hotz
c8bf09b7d4 s/UOps/Ops (#7500)
* s/UOps/Ops [pr]

* fix
2024-11-03 11:26:10 +08:00
George Hotz
72a9ac27e9 support image dtype in cloud [pr] (#7482)
* support image dtype in cloud [pr]

* remove outdated osx hack

* unused imports
2024-11-02 23:54:27 +08:00
George Hotz
26df50cf43 move memory_planner to memory.py [pr] (#7079) 2024-10-16 10:04:35 +08:00
George Hotz
5c9f76e274 hotfix: openpilot compile3 compare to i==1 2024-10-12 09:44:24 +08:00
George Hotz
f45d178a55 hotfix: support JIT_BATCH_SIZE=0, make that the default 2024-09-25 10:36:04 +08:00
George Hotz
b9e6d42a1f Revert "gated native math in OpenCL (#6683)" (#6691)
This reverts commit 2fe3eeed17.
2024-09-24 08:48:10 +08:00
George Hotz
2fe3eeed17 gated native math in OpenCL (#6683)
* gated native math

* Update cstyle.py
2024-09-23 19:22:13 +08:00
chenyu
b14c1bc417 UOps.RANGE is_increasing (#6615)
* UOps.RANGE is_increasing

283 -> 47 valids

* test
2024-09-20 03:14:52 -04:00
George Hotz
d02bb270b7 add copyin copyout for image on GPU [run_process_replay] (#6580)
* add copyin copyout for image on GPU [run_process_replay]

* add timing

* enqueue vs total run

* it's failing but that's fine
2024-09-18 16:06:20 +08:00
George Hotz
d4b662c318 new openpilot compile (#6573)
* new openpilot compile

* note, copyout doesn't work for images
2024-09-18 14:22:50 +08:00
chenyu
798be6bb74 add gated read_image count in openpilot compile2 (#6546)
530 to go
2024-09-16 21:17:00 -04:00
qazal
28c75bf2a6 merge uops with ops (#6111)
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-08-16 18:17:57 -04:00
qazal
c23d44c779 AST is UOp (#6030)
* most of the work from the uops2 branch

* schedule

* realize

* kernel

* lowerer

* search

* green

* merge uops with ops

* Revert "merge uops with ops"

This reverts commit 1408a59f12.

* fix benchmark

* remove extra dedup
2024-08-16 22:09:00 +03:00
George Hotz
e077bc7baf move memory planner to realize (#5937) 2024-08-06 10:41:29 -07:00
George Hotz
fa7e734b49 MetaOps.KERNEL (#5543) 2024-07-17 19:41:23 -07:00
chenyu
4df63da190 clean up rest of the loadop [run_process_replay] (#5440)
to metaop and filter_sink
2024-07-12 23:38:51 -04:00