triton.language
===============

.. currentmodule:: triton.language


Programming Model
-----------------

.. autosummary::
    :toctree: generated
    :nosignatures:

    program_id
    num_programs


Creation Ops
------------

.. autosummary::
    :toctree: generated
    :nosignatures:

    arange
    zeros


Shape Manipulation Ops
----------------------

.. autosummary::
    :toctree: generated
    :nosignatures:

    broadcast_to
    reshape
    ravel


Linear Algebra Ops
------------------

.. autosummary::
    :toctree: generated
    :nosignatures:

    dot


Memory Ops
----------

.. autosummary::
    :toctree: generated
    :nosignatures:

    load
    store
    atomic_cas
    atomic_xchg


Indexing Ops
------------

.. autosummary::
    :toctree: generated
    :nosignatures:

    where


Math Ops
--------

.. autosummary::
    :toctree: generated
    :nosignatures:

    abs
    exp
    log
    cos
    sin
    sqrt
    sigmoid
    softmax


Reduction Ops
-------------

.. autosummary::
    :toctree: generated
    :nosignatures:

    max
    min
    sum


Atomic Ops
----------

.. autosummary::
    :toctree: generated
    :nosignatures:

    atomic_cas
    atomic_add
    atomic_max
    atomic_min


Comparison ops
--------------

.. autosummary::
    :toctree: generated
    :nosignatures:

    minimum
    maximum

.. _Random Number Generation:

Random Number Generation
------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:

    randint4x
    randint
    rand
    randn


Compiler Hint Ops
-----------------

.. autosummary::
    :toctree: generated
    :nosignatures:

    multiple_of