use BEAM=2 instead of BEAM=4 in cuda ci gpt2 (#3089)

BEAM=2 is faster and less search time. investigating why BEAM2+BEAM4 is slower than BEAM2 alone
2026-04-29 03:00:14 -04:00 · 2024-01-11 13:21:06 -05:00
parent f502c9b08f
commit 93e3f952aa
1 changed files with 1 additions and 1 deletions
--- a/.github/workflows/benchmark.yml
+++ b/.github/workflows/benchmark.yml
@@ -87,7 +87,7 @@ jobs:
    - name: Run GPT2 w HALF
      run: CUDA=1 JIT=1 HALF=1 python3 examples/gpt2.py --count 10 --temperature 0 --timing
    - name: Run GPT2 w HALF/BEAM
-      run: CUDA=1 JIT=1 HALF=1 BEAM=4 CACHELEVEL=0 python3 examples/gpt2.py --count 10 --temperature 0 --timing | tee gpt2_half_beam.txt
+      run: CUDA=1 JIT=1 HALF=1 BEAM=2 CACHELEVEL=0 python3 examples/gpt2.py --count 10 --temperature 0 --timing | tee gpt2_half_beam.txt
    - uses: actions/upload-artifact@v4
      with:
        name: Speed (NVIDIA)