AMD-SHARK-Studio

mirror of https://github.com/nod-ai/AMD-SHARK-Studio.git synced 2026-02-19 11:56:43 -05:00

Files

Jakub Kuderski 2da31c4109 [vicuna.py] Rework benchmark statistics calculation (#1992)

- Move statistics out of the main loop
- Add 'end-to-end' numbers
- Switch the main display unit from s to ms
- Start measuring time at 0

The new print format looks like this:
```
Number of iterations: 5
Num tokens: 1 (prompt), 512 (generated), 513 (total)
Prefill: avg. 0.01 ms (stdev 0.00), avg. 97.99 tokens/s
Decode: avg. 4840.44 ms (stdev 28.80), avg. 97.99 tokens/s
Decode end-2-end: avg. 85.78 tokens/s (w/o prompt), avg. 95.98 (w/ prompt)
```
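As a rough illustration of what "moving statistics out of the main loop" can look like, the sketch below collects raw per-iteration timings first and derives every reported number afterwards. It is not the code from `vicuna.py`: the names `BenchmarkStats`, `summarize`, and `format_report` are hypothetical, and the throughput definitions (tokens divided by mean time, end-to-end time taken as prefill plus decode) are assumptions that are not claimed to reproduce the sample figures above.

```python
import statistics
from dataclasses import dataclass


@dataclass
class BenchmarkStats:
    """Hypothetical container for the aggregated numbers (not from vicuna.py)."""
    iterations: int
    prompt_tokens: int
    generated_tokens: int
    prefill_avg_ms: float
    prefill_stdev_ms: float
    prefill_tokens_per_s: float
    decode_avg_ms: float
    decode_stdev_ms: float
    decode_tokens_per_s: float
    e2e_tokens_per_s_without_prompt: float
    e2e_tokens_per_s_with_prompt: float


def _stdev(values):
    # Sample standard deviation needs at least two data points.
    return statistics.stdev(values) if len(values) > 1 else 0.0


def _rate(tokens, seconds):
    # Avoid dividing by zero when a phase is too fast to measure.
    return tokens / seconds if seconds > 0 else float("inf")


def summarize(prefill_s, decode_s, prompt_tokens, generated_tokens):
    """Aggregate per-iteration timings (in seconds) after the benchmark loop."""
    if not prefill_s or len(prefill_s) != len(decode_s):
        raise ValueError("need one prefill and one decode time per iteration")

    prefill_avg = statistics.mean(prefill_s)
    decode_avg = statistics.mean(decode_s)
    # Assumption: end-to-end time of one iteration is prefill plus decode.
    e2e_avg = statistics.mean(p + d for p, d in zip(prefill_s, decode_s))

    return BenchmarkStats(
        iterations=len(prefill_s),
        prompt_tokens=prompt_tokens,
        generated_tokens=generated_tokens,
        prefill_avg_ms=prefill_avg * 1000.0,
        prefill_stdev_ms=_stdev(prefill_s) * 1000.0,
        prefill_tokens_per_s=_rate(prompt_tokens, prefill_avg),
        decode_avg_ms=decode_avg * 1000.0,
        decode_stdev_ms=_stdev(decode_s) * 1000.0,
        decode_tokens_per_s=_rate(generated_tokens, decode_avg),
        e2e_tokens_per_s_without_prompt=_rate(generated_tokens, e2e_avg),
        e2e_tokens_per_s_with_prompt=_rate(
            prompt_tokens + generated_tokens, e2e_avg
        ),
    )


def format_report(s):
    """Render the stats in a layout modeled on the print format shown above."""
    total = s.prompt_tokens + s.generated_tokens
    return "\n".join([
        f"Number of iterations: {s.iterations}",
        f"Num tokens: {s.prompt_tokens} (prompt), "
        f"{s.generated_tokens} (generated), {total} (total)",
        f"Prefill: avg. {s.prefill_avg_ms:.2f} ms "
        f"(stdev {s.prefill_stdev_ms:.2f}), "
        f"avg. {s.prefill_tokens_per_s:.2f} tokens/s",
        f"Decode: avg. {s.decode_avg_ms:.2f} ms "
        f"(stdev {s.decode_stdev_ms:.2f}), "
        f"avg. {s.decode_tokens_per_s:.2f} tokens/s",
        f"Decode end-2-end: avg. {s.e2e_tokens_per_s_without_prompt:.2f} "
        f"tokens/s (w/o prompt), "
        f"avg. {s.e2e_tokens_per_s_with_prompt:.2f} (w/ prompt)",
    ])
```

In this sketch the benchmark loop would only append measured durations (for example from `time.perf_counter()`) to two lists, and `format_report(summarize(...))` would be called once after the last iteration.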

2023-11-23 12:04:03 -05:00

language_models

[vicuna.py] Rework benchmark statistics calculation (#1992)

2023-11-23 12:04:03 -05:00

shark_studio

(SHARK Studio) Add Turbine-based LLM chatbot. (#1933)

2023-11-14 09:56:28 -06:00

stable_diffusion

Add .mlir to startup shark_tmp cleanup (#1991)

2023-11-22 14:34:28 -06:00

__init__.py

[SD] Reorganize the stable diffusion model. (#806)

2023-01-31 14:42:41 -08:00