response filter (#1039)

* response filter (see the usage sketch after this list)

* rewrite `implement` based on the filter

* multi responses

* abs path

* code handling

* option to not use docker

* context

* eval_only -> raise_error

* notebook

* utils

* utils

* separate tests

* test

* test

* test

* test

* test

* test

* test

* test

* `**config` in `test()`

* test

* test

* filename
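
The main feature here is the response filter. Below is a minimal usage sketch; the `filter_func` and `config_list` keyword names and the filter signature are assumptions for illustration, not confirmed by this commit message:

```python
import json

from flaml import oai


# Hypothetical filter: accept a response only if one of its completions
# parses as valid JSON.
def valid_json_filter(response, **_):
    for text in oai.Completion.extract_text(response):
        try:
            json.loads(text)
            return True
        except ValueError:
            pass
    return False


# Assumed usage: when the filter returns False, the next configuration in
# `config_list` is tried until a response passes (or the list is exhausted).
response = oai.Completion.create(
    config_list=[{"model": "text-ada-001"}, {"model": "gpt-3.5-turbo"}],
    prompt="Return a JSON request body for a web search for 'latest AI news'.",
    filter_func=valid_json_filter,
)
```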
Chi Wang
2023-05-21 15:22:29 -07:00
committed by GitHub
parent 7de4eb347d
commit e463146cb8
21 changed files with 2253 additions and 1820 deletions


@@ -128,10 +128,10 @@ print(eval_with_generated_assertions(oai.Completion.extract_text(response), **tu
 You can use flaml's `oai.Completion.test` to evaluate the performance of an entire dataset with the tuned config.
 ```python
-result = oai.Completion.test(test_data, config)
+result = oai.Completion.test(test_data, **config)
 print("performance on test data with the tuned config:", result)
 ```
 The result will vary with the inference budget and optimization budget.
-[Link to notebook](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_openai.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/autogen_openai.ipynb)
+[Link to notebook](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_openai_completion.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/autogen_openai_completion.ipynb)
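
For context on the `config` → `**config` change in the hunk above: `oai.Completion.tune` returns the tuned settings as a plain dict of keyword arguments, so `test` now receives them unpacked rather than as a single positional argument. A minimal sketch, with toy data, prompt template, and metric standing in for a real task:

```python
from flaml import oai

# Toy placeholders; a real application supplies its own data and metric.
tune_data = [{"problem": "1 + 1 = ?", "solution": "2"}]
test_data = [{"problem": "2 + 2 = ?", "solution": "4"}]

def eval_func(responses, solution, **_):
    # Success if any sampled response contains the expected answer.
    return {"success": any(solution in r for r in responses)}

config, analysis = oai.Completion.tune(
    data=tune_data,
    metric="success",
    mode="max",
    eval_func=eval_func,
    prompt="{problem}",
    inference_budget=0.05,
    optimization_budget=1,
)

# `config` is a dict of keyword arguments (model, prompt, max_tokens, ...),
# which is why it is unpacked with ** when evaluating on the test split.
result = oai.Completion.test(test_data, **config)
print("performance on test data with the tuned config:", result)
```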