Deploy a new doc website (#338)

This PR deploys a new documentation website. Additional changes:

* add actions for doc

* update docstrings

* add installation instructions for doc development

* unify README and Getting Started

* rename notebook

* add documentation for best_model_for_estimator #340

* add docstring for keep_search_state #340

* DNN

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
Co-authored-by: Z.sk <shaokunzhang@psu.edu>
Authored by: Chi Wang
Date: 2021-12-16 17:11:33 -08:00
Committed by: GitHub
parent 671ccbbe3f
commit efd85b4c86
91 changed files with 12277 additions and 752 deletions


@@ -0,0 +1,88 @@
# Contributing
This project welcomes (and encourages) all forms of contributions, including but not limited to:
- Pushing patches.
- Code review of pull requests.
- Documentation, examples and test cases.
- Readability improvements, e.g., improvements to docstrings and comments.
- Community participation in [issues](https://github.com/microsoft/FLAML/issues), [discussions](https://github.com/microsoft/FLAML/discussions), and [gitter](https://gitter.im/FLAMLer/community?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge).
- Tutorials, blog posts, talks that promote the project.
- Sharing application scenarios and/or related research.
You can take a look at the [Roadmap for Upcoming Features](https://github.com/microsoft/FLAML/wiki/Roadmap-for-Upcoming-Features) to identify potential things to work on.
Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
the rights to use your contribution. For details, visit <https://cla.opensource.microsoft.com>.
If you are new to GitHub, [here](https://help.github.com/categories/collaborating-with-issues-and-pull-requests/) is a detailed help guide on getting involved with development on GitHub.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
## Becoming a Reviewer
There is currently no formal reviewer solicitation process. Current reviewers identify new reviewers from among active contributors. If you are willing to become a reviewer, you are welcome to let us know on gitter.
## Developing
### Setup
```bash
git clone https://github.com/microsoft/FLAML.git
cd FLAML
pip install -e .[test,notebook]
```
### Docker
We provide a simple [Dockerfile](https://github.com/microsoft/FLAML/blob/main/Dockerfile).
```bash
docker build git://github.com/microsoft/FLAML -t flaml-dev
docker run -it flaml-dev
```
### Develop in Remote Container
If you use VS Code, you can open the FLAML folder in a [Container](https://code.visualstudio.com/docs/remote/containers).
We have provided the configuration in [devcontainer](https://github.com/microsoft/FLAML/blob/main/.devcontainer).
### Pre-commit
Run `pre-commit install` to install pre-commit into your git hooks. Before you commit, run
`pre-commit run` to check whether you meet the pre-commit requirements. If you use Windows (without WSL) and can't commit after installing pre-commit, you can run `pre-commit uninstall` to uninstall the hook; in WSL or Linux the hooks should work as expected.
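For reference, the full sequence typically looks like the following (installing pre-commit via pip is an assumption; use your preferred installer):
```bash
pip install pre-commit      # if pre-commit is not installed yet
pre-commit install          # register the git hook
pre-commit run --all-files  # check all files before committing
```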
### Coverage
Any code you commit should not decrease coverage. To run all unit tests:
```bash
coverage run -m pytest test
```
Then you can view the coverage report with
`coverage report -m` or `coverage html`.
If all the tests pass, please also run the [notebook/automl_classification](https://github.com/microsoft/FLAML/blob/main/notebook/automl_classification.ipynb) notebook to make sure your commit does not break the notebook example.
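One way to execute the notebook non-interactively from the command line is via nbconvert (a sketch; it assumes Jupyter is available, e.g., via the [notebook] option):
```bash
jupyter nbconvert --to notebook --execute notebook/automl_classification.ipynb
```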
### Documentation
To build and test documentation locally, install [Node.js](https://nodejs.org/en/download/).
Then:
```console
npm install --global yarn
pip install pydoc-markdown
cd website
yarn install
pydoc-markdown
yarn start
```
The last command starts a local development server and opens up a browser window.
Most changes are reflected live without having to restart the server.
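If you also want to verify a production build of the site (assuming the default yarn/Docusaurus setup), you can additionally run:
```console
yarn build
```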


@@ -0,0 +1,62 @@
# AutoML - Classification
### A basic classification example
```python
from flaml import AutoML
from sklearn.datasets import load_iris
# Initialize an AutoML instance
automl = AutoML()
# Specify automl goal and constraint
automl_settings = {
    "time_budget": 1,  # in seconds
    "metric": 'accuracy',
    "task": 'classification',
    "log_file_name": "iris.log",
}
X_train, y_train = load_iris(return_X_y=True)
# Train with labeled input data
automl.fit(X_train=X_train, y_train=y_train,
           **automl_settings)
# Predict
print(automl.predict_proba(X_train))
# Print the best model
print(automl.model.estimator)
```
#### Sample output
```
[flaml.automl: 11-12 18:21:44] {1485} INFO - Data split method: stratified
[flaml.automl: 11-12 18:21:44] {1489} INFO - Evaluation method: cv
[flaml.automl: 11-12 18:21:44] {1540} INFO - Minimizing error metric: 1-accuracy
[flaml.automl: 11-12 18:21:44] {1577} INFO - List of ML learners in AutoML Run: ['lgbm', 'rf', 'catboost', 'xgboost', 'extra_tree', 'lrl1']
[flaml.automl: 11-12 18:21:44] {1826} INFO - iteration 0, current learner lgbm
[flaml.automl: 11-12 18:21:44] {1944} INFO - Estimated sufficient time budget=1285s. Estimated necessary time budget=23s.
[flaml.automl: 11-12 18:21:44] {2029} INFO - at 0.2s, estimator lgbm's best error=0.0733, best estimator lgbm's best error=0.0733
[flaml.automl: 11-12 18:21:44] {1826} INFO - iteration 1, current learner lgbm
[flaml.automl: 11-12 18:21:44] {2029} INFO - at 0.3s, estimator lgbm's best error=0.0733, best estimator lgbm's best error=0.0733
[flaml.automl: 11-12 18:21:44] {1826} INFO - iteration 2, current learner lgbm
[flaml.automl: 11-12 18:21:44] {2029} INFO - at 0.4s, estimator lgbm's best error=0.0533, best estimator lgbm's best error=0.0533
[flaml.automl: 11-12 18:21:44] {1826} INFO - iteration 3, current learner lgbm
[flaml.automl: 11-12 18:21:44] {2029} INFO - at 0.6s, estimator lgbm's best error=0.0533, best estimator lgbm's best error=0.0533
[flaml.automl: 11-12 18:21:44] {1826} INFO - iteration 4, current learner lgbm
[flaml.automl: 11-12 18:21:44] {2029} INFO - at 0.6s, estimator lgbm's best error=0.0533, best estimator lgbm's best error=0.0533
[flaml.automl: 11-12 18:21:44] {1826} INFO - iteration 5, current learner xgboost
[flaml.automl: 11-12 18:21:45] {2029} INFO - at 0.9s, estimator xgboost's best error=0.0600, best estimator lgbm's best error=0.0533
[flaml.automl: 11-12 18:21:45] {1826} INFO - iteration 6, current learner lgbm
[flaml.automl: 11-12 18:21:45] {2029} INFO - at 1.0s, estimator lgbm's best error=0.0533, best estimator lgbm's best error=0.0533
[flaml.automl: 11-12 18:21:45] {1826} INFO - iteration 7, current learner extra_tree
[flaml.automl: 11-12 18:21:45] {2029} INFO - at 1.1s, estimator extra_tree's best error=0.0667, best estimator lgbm's best error=0.0533
[flaml.automl: 11-12 18:21:45] {2242} INFO - retrain lgbm for 0.0s
[flaml.automl: 11-12 18:21:45] {2247} INFO - retrained model: LGBMClassifier(learning_rate=0.2677050123105203, max_bin=127,
min_child_samples=12, n_estimators=4, num_leaves=4,
reg_alpha=0.001348364934537134, reg_lambda=1.4442580148221913,
verbose=-1)
[flaml.automl: 11-12 18:21:45] {1608} INFO - fit succeeded
[flaml.automl: 11-12 18:21:45] {1610} INFO - Time taken to find the best model: 0.3756711483001709
```
### A more advanced example including custom learner and metric
[Link to notebook](https://github.com/microsoft/FLAML/blob/main/notebook/flaml_automl.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/flaml_automl.ipynb)
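The notebook covers custom learners and metrics in detail. As a rough illustration of the custom-metric part only: a metric function in FLAML returns the value to minimize together with a dict of metrics to log. The sketch below assumes this interface (argument names may differ slightly across versions) and reuses the iris data from the example above:
```python
from flaml import AutoML
from sklearn.metrics import log_loss

# a sketch of a custom metric function; FLAML calls it with the fitted estimator
# and the validation/training data, and expects (metric_to_minimize, metrics_to_log)
def custom_metric(X_val, y_val, estimator, labels, X_train, y_train,
                  weight_val=None, weight_train=None, *args, **kwargs):
    y_pred = estimator.predict_proba(X_val)
    val_loss = log_loss(y_val, y_pred, labels=labels, sample_weight=weight_val)
    return val_loss, {"val_loss": val_loss}

automl = AutoML()
automl.fit(X_train=X_train, y_train=y_train, metric=custom_metric,
           task='classification', time_budget=10)
```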


@@ -0,0 +1,89 @@
# AutoML - NLP
### Requirements
This example requires a GPU. Install the [nlp] option:
```bash
pip install "flaml[nlp]"
```
### A simple sequence classification example
```python
from flaml import AutoML
from datasets import load_dataset
train_dataset = load_dataset("glue", "mrpc", split="train").to_pandas()
dev_dataset = load_dataset("glue", "mrpc", split="validation").to_pandas()
test_dataset = load_dataset("glue", "mrpc", split="test").to_pandas()
custom_sent_keys = ["sentence1", "sentence2"]
label_key = "label"
X_train, y_train = train_dataset[custom_sent_keys], train_dataset[label_key]
X_val, y_val = dev_dataset[custom_sent_keys], dev_dataset[label_key]
X_test = test_dataset[custom_sent_keys]
automl = AutoML()
automl_settings = {
    "time_budget": 100,
    "task": "seq-classification",
    "custom_hpo_args": {"output_dir": "data/output/"},
    "gpu_per_trial": 1,  # set to 0 if no GPU is available
}
automl.fit(X_train=X_train, y_train=y_train, X_val=X_val, y_val=y_val, **automl_settings)
automl.predict(X_test)
```
#### Sample output
```
[flaml.automl: 12-06 08:21:39] {1943} INFO - task = seq-classification
[flaml.automl: 12-06 08:21:39] {1945} INFO - Data split method: stratified
[flaml.automl: 12-06 08:21:39] {1949} INFO - Evaluation method: holdout
[flaml.automl: 12-06 08:21:39] {2019} INFO - Minimizing error metric: 1-accuracy
[flaml.automl: 12-06 08:21:39] {2071} INFO - List of ML learners in AutoML Run: ['transformer']
[flaml.automl: 12-06 08:21:39] {2311} INFO - iteration 0, current learner transformer
{'data/output/train_2021-12-06_08-21-53/train_8947b1b2_1_n=1e-06,s=9223372036854775807,e=1e-05,s=-1,s=0.45765,e=32,d=42,o=0.0,y=0.0_2021-12-06_08-21-53/checkpoint-53': 53}
[flaml.automl: 12-06 08:22:56] {2424} INFO - Estimated sufficient time budget=766860s. Estimated necessary time budget=767s.
[flaml.automl: 12-06 08:22:56] {2499} INFO - at 76.7s, estimator transformer's best error=0.1740, best estimator transformer's best error=0.1740
[flaml.automl: 12-06 08:22:56] {2606} INFO - selected model: <flaml.nlp.huggingface.trainer.TrainerForAuto object at 0x7f49ea8414f0>
[flaml.automl: 12-06 08:22:56] {2100} INFO - fit succeeded
[flaml.automl: 12-06 08:22:56] {2101} INFO - Time taken to find the best model: 76.69802761077881
[flaml.automl: 12-06 08:22:56] {2112} WARNING - Time taken to find the best model is 77% of the provided time budget and not all estimators' hyperparameter search converged. Consider increasing the time budget.
```
### A simple sequence regression example
```python
from flaml import AutoML
from datasets import load_dataset
train_dataset = (
    load_dataset("glue", "stsb", split="train[:1%]").to_pandas().iloc[0:4]
)
dev_dataset = (
    load_dataset("glue", "stsb", split="train[1%:2%]").to_pandas().iloc[0:4]
)
custom_sent_keys = ["sentence1", "sentence2"]
label_key = "label"
X_train = train_dataset[custom_sent_keys]
y_train = train_dataset[label_key]
X_val = dev_dataset[custom_sent_keys]
y_val = dev_dataset[label_key]
automl = AutoML()
automl_settings = {
    "gpu_per_trial": 0,
    "time_budget": 20,
    "task": "seq-regression",
    "metric": "rmse",
}
automl_settings["custom_hpo_args"] = {
    "model_path": "google/electra-small-discriminator",
    "output_dir": "data/output/",
    "ckpt_per_epoch": 5,
    "fp16": False,
}
automl.fit(
    X_train=X_train, y_train=y_train, X_val=X_val, y_val=y_val, **automl_settings
)
```
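As in the classification example above, predictions can then be obtained from the fitted model (a minimal sketch):
```python
print(automl.predict(X_val))
```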


@@ -0,0 +1,96 @@
# AutoML - Rank
### A simple learning-to-rank example
```python
from sklearn.datasets import fetch_openml
from flaml import AutoML
X_train, y_train = fetch_openml(name="credit-g", return_X_y=True, as_frame=True)
y_train = y_train.cat.codes  # convert the categorical labels to integer codes
# not a real learning-to-rank dataset
groups = [200] * 4 + [100] * 2  # group counts
automl = AutoML()
automl.fit(
    X_train, y_train, groups=groups,
    task='rank', time_budget=10,  # in seconds
)
```
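After fitting, ranking scores can be obtained with `predict` (a minimal sketch; here we simply score the training items):
```python
print(automl.predict(X_train))
```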
#### Sample output
```
[flaml.automl: 11-15 07:14:30] {1485} INFO - Data split method: group
[flaml.automl: 11-15 07:14:30] {1489} INFO - Evaluation method: holdout
[flaml.automl: 11-15 07:14:30] {1540} INFO - Minimizing error metric: 1-ndcg
[flaml.automl: 11-15 07:14:30] {1577} INFO - List of ML learners in AutoML Run: ['lgbm', 'xgboost']
[flaml.automl: 11-15 07:14:30] {1826} INFO - iteration 0, current learner lgbm
[flaml.automl: 11-15 07:14:30] {1944} INFO - Estimated sufficient time budget=679s. Estimated necessary time budget=1s.
[flaml.automl: 11-15 07:14:30] {2029} INFO - at 0.1s, estimator lgbm's best error=0.0248, best estimator lgbm's best error=0.0248
[flaml.automl: 11-15 07:14:30] {1826} INFO - iteration 1, current learner lgbm
[flaml.automl: 11-15 07:14:30] {2029} INFO - at 0.1s, estimator lgbm's best error=0.0248, best estimator lgbm's best error=0.0248
[flaml.automl: 11-15 07:14:30] {1826} INFO - iteration 2, current learner lgbm
[flaml.automl: 11-15 07:14:30] {2029} INFO - at 0.2s, estimator lgbm's best error=0.0248, best estimator lgbm's best error=0.0248
[flaml.automl: 11-15 07:14:30] {1826} INFO - iteration 3, current learner lgbm
[flaml.automl: 11-15 07:14:30] {2029} INFO - at 0.2s, estimator lgbm's best error=0.0248, best estimator lgbm's best error=0.0248
[flaml.automl: 11-15 07:14:30] {1826} INFO - iteration 4, current learner xgboost
[flaml.automl: 11-15 07:14:30] {2029} INFO - at 0.2s, estimator xgboost's best error=0.0315, best estimator lgbm's best error=0.0248
[flaml.automl: 11-15 07:14:30] {1826} INFO - iteration 5, current learner xgboost
[flaml.automl: 11-15 07:14:30] {2029} INFO - at 0.2s, estimator xgboost's best error=0.0315, best estimator lgbm's best error=0.0248
[flaml.automl: 11-15 07:14:30] {1826} INFO - iteration 6, current learner lgbm
[flaml.automl: 11-15 07:14:30] {2029} INFO - at 0.3s, estimator lgbm's best error=0.0248, best estimator lgbm's best error=0.0248
[flaml.automl: 11-15 07:14:30] {1826} INFO - iteration 7, current learner lgbm
[flaml.automl: 11-15 07:14:30] {2029} INFO - at 0.3s, estimator lgbm's best error=0.0248, best estimator lgbm's best error=0.0248
[flaml.automl: 11-15 07:14:30] {1826} INFO - iteration 8, current learner xgboost
[flaml.automl: 11-15 07:14:30] {2029} INFO - at 0.4s, estimator xgboost's best error=0.0315, best estimator lgbm's best error=0.0248
[flaml.automl: 11-15 07:14:30] {1826} INFO - iteration 9, current learner xgboost
[flaml.automl: 11-15 07:14:30] {2029} INFO - at 0.4s, estimator xgboost's best error=0.0315, best estimator lgbm's best error=0.0248
[flaml.automl: 11-15 07:14:30] {1826} INFO - iteration 10, current learner xgboost
[flaml.automl: 11-15 07:14:30] {2029} INFO - at 0.4s, estimator xgboost's best error=0.0233, best estimator xgboost's best error=0.0233
[flaml.automl: 11-15 07:14:30] {1826} INFO - iteration 11, current learner xgboost
[flaml.automl: 11-15 07:14:30] {2029} INFO - at 0.4s, estimator xgboost's best error=0.0233, best estimator xgboost's best error=0.0233
[flaml.automl: 11-15 07:14:30] {1826} INFO - iteration 12, current learner xgboost
[flaml.automl: 11-15 07:14:30] {2029} INFO - at 0.4s, estimator xgboost's best error=0.0233, best estimator xgboost's best error=0.0233
[flaml.automl: 11-15 07:14:30] {1826} INFO - iteration 13, current learner xgboost
[flaml.automl: 11-15 07:14:30] {2029} INFO - at 0.4s, estimator xgboost's best error=0.0233, best estimator xgboost's best error=0.0233
[flaml.automl: 11-15 07:14:30] {1826} INFO - iteration 14, current learner lgbm
[flaml.automl: 11-15 07:14:30] {2029} INFO - at 0.5s, estimator lgbm's best error=0.0225, best estimator lgbm's best error=0.0225
[flaml.automl: 11-15 07:14:30] {1826} INFO - iteration 15, current learner xgboost
[flaml.automl: 11-15 07:14:30] {2029} INFO - at 0.5s, estimator xgboost's best error=0.0233, best estimator lgbm's best error=0.0225
[flaml.automl: 11-15 07:14:30] {1826} INFO - iteration 16, current learner lgbm
[flaml.automl: 11-15 07:14:30] {2029} INFO - at 0.5s, estimator lgbm's best error=0.0225, best estimator lgbm's best error=0.0225
[flaml.automl: 11-15 07:14:30] {1826} INFO - iteration 17, current learner lgbm
[flaml.automl: 11-15 07:14:31] {2029} INFO - at 0.5s, estimator lgbm's best error=0.0225, best estimator lgbm's best error=0.0225
[flaml.automl: 11-15 07:14:31] {1826} INFO - iteration 18, current learner lgbm
[flaml.automl: 11-15 07:14:31] {2029} INFO - at 0.6s, estimator lgbm's best error=0.0225, best estimator lgbm's best error=0.0225
[flaml.automl: 11-15 07:14:31] {1826} INFO - iteration 19, current learner lgbm
[flaml.automl: 11-15 07:14:31] {2029} INFO - at 0.6s, estimator lgbm's best error=0.0201, best estimator lgbm's best error=0.0201
[flaml.automl: 11-15 07:14:31] {1826} INFO - iteration 20, current learner lgbm
[flaml.automl: 11-15 07:14:31] {2029} INFO - at 0.6s, estimator lgbm's best error=0.0201, best estimator lgbm's best error=0.0201
[flaml.automl: 11-15 07:14:31] {1826} INFO - iteration 21, current learner lgbm
[flaml.automl: 11-15 07:14:31] {2029} INFO - at 0.7s, estimator lgbm's best error=0.0201, best estimator lgbm's best error=0.0201
[flaml.automl: 11-15 07:14:31] {1826} INFO - iteration 22, current learner lgbm
[flaml.automl: 11-15 07:14:31] {2029} INFO - at 0.7s, estimator lgbm's best error=0.0201, best estimator lgbm's best error=0.0201
[flaml.automl: 11-15 07:14:31] {1826} INFO - iteration 23, current learner lgbm
[flaml.automl: 11-15 07:14:31] {2029} INFO - at 0.8s, estimator lgbm's best error=0.0201, best estimator lgbm's best error=0.0201
[flaml.automl: 11-15 07:14:31] {1826} INFO - iteration 24, current learner lgbm
[flaml.automl: 11-15 07:14:31] {2029} INFO - at 0.8s, estimator lgbm's best error=0.0201, best estimator lgbm's best error=0.0201
[flaml.automl: 11-15 07:14:31] {1826} INFO - iteration 25, current learner lgbm
[flaml.automl: 11-15 07:14:31] {2029} INFO - at 0.8s, estimator lgbm's best error=0.0201, best estimator lgbm's best error=0.0201
[flaml.automl: 11-15 07:14:31] {1826} INFO - iteration 26, current learner lgbm
[flaml.automl: 11-15 07:14:31] {2029} INFO - at 0.9s, estimator lgbm's best error=0.0197, best estimator lgbm's best error=0.0197
[flaml.automl: 11-15 07:14:31] {1826} INFO - iteration 27, current learner lgbm
[flaml.automl: 11-15 07:14:31] {2029} INFO - at 0.9s, estimator lgbm's best error=0.0197, best estimator lgbm's best error=0.0197
[flaml.automl: 11-15 07:14:31] {1826} INFO - iteration 28, current learner lgbm
[flaml.automl: 11-15 07:14:31] {2029} INFO - at 1.0s, estimator lgbm's best error=0.0197, best estimator lgbm's best error=0.0197
[flaml.automl: 11-15 07:14:31] {1826} INFO - iteration 29, current learner lgbm
[flaml.automl: 11-15 07:14:31] {2029} INFO - at 1.0s, estimator lgbm's best error=0.0197, best estimator lgbm's best error=0.0197
[flaml.automl: 11-15 07:14:31] {2242} INFO - retrain lgbm for 0.0s
[flaml.automl: 11-15 07:14:31] {2247} INFO - retrained model: LGBMRanker(colsample_bytree=0.9852774042640857,
learning_rate=0.034918421933217675, max_bin=1023,
min_child_samples=22, n_estimators=6, num_leaves=23,
reg_alpha=0.0009765625, reg_lambda=21.505295697527654, verbose=-1)
[flaml.automl: 11-15 07:14:31] {1608} INFO - fit succeeded
[flaml.automl: 11-15 07:14:31] {1610} INFO - Time taken to find the best model: 0.8846545219421387
[flaml.automl: 11-15 07:14:31] {1624} WARNING - Time taken to find the best model is 88% of the provided time budget and not all estimators' hyperparameter search converged. Consider increasing the time budget.
```


@@ -0,0 +1,101 @@
# AutoML - Regression
### A basic regression example
```python
from flaml import AutoML
from sklearn.datasets import fetch_california_housing
# Initialize an AutoML instance
automl = AutoML()
# Specify automl goal and constraint
automl_settings = {
    "time_budget": 1,  # in seconds
    "metric": 'r2',
    "task": 'regression',
    "log_file_name": "california.log",
}
X_train, y_train = fetch_california_housing(return_X_y=True)
# Train with labeled input data
automl.fit(X_train=X_train, y_train=y_train,
           **automl_settings)
# Predict
print(automl.predict(X_train))
# Print the best model
print(automl.model.estimator)
```
#### Sample output
```
[flaml.automl: 11-15 07:08:19] {1485} INFO - Data split method: uniform
[flaml.automl: 11-15 07:08:19] {1489} INFO - Evaluation method: holdout
[flaml.automl: 11-15 07:08:19] {1540} INFO - Minimizing error metric: 1-r2
[flaml.automl: 11-15 07:08:19] {1577} INFO - List of ML learners in AutoML Run: ['lgbm', 'rf', 'catboost', 'xgboost', 'extra_tree']
[flaml.automl: 11-15 07:08:19] {1826} INFO - iteration 0, current learner lgbm
[flaml.automl: 11-15 07:08:19] {1944} INFO - Estimated sufficient time budget=846s. Estimated necessary time budget=2s.
[flaml.automl: 11-15 07:08:19] {2029} INFO - at 0.2s, estimator lgbm's best error=0.7393, best estimator lgbm's best error=0.7393
[flaml.automl: 11-15 07:08:19] {1826} INFO - iteration 1, current learner lgbm
[flaml.automl: 11-15 07:08:19] {2029} INFO - at 0.3s, estimator lgbm's best error=0.7393, best estimator lgbm's best error=0.7393
[flaml.automl: 11-15 07:08:19] {1826} INFO - iteration 2, current learner lgbm
[flaml.automl: 11-15 07:08:19] {2029} INFO - at 0.3s, estimator lgbm's best error=0.5446, best estimator lgbm's best error=0.5446
[flaml.automl: 11-15 07:08:19] {1826} INFO - iteration 3, current learner lgbm
[flaml.automl: 11-15 07:08:19] {2029} INFO - at 0.4s, estimator lgbm's best error=0.2807, best estimator lgbm's best error=0.2807
[flaml.automl: 11-15 07:08:19] {1826} INFO - iteration 4, current learner lgbm
[flaml.automl: 11-15 07:08:19] {2029} INFO - at 0.5s, estimator lgbm's best error=0.2712, best estimator lgbm's best error=0.2712
[flaml.automl: 11-15 07:08:19] {1826} INFO - iteration 5, current learner lgbm
[flaml.automl: 11-15 07:08:19] {2029} INFO - at 0.5s, estimator lgbm's best error=0.2712, best estimator lgbm's best error=0.2712
[flaml.automl: 11-15 07:08:19] {1826} INFO - iteration 6, current learner lgbm
[flaml.automl: 11-15 07:08:20] {2029} INFO - at 0.6s, estimator lgbm's best error=0.2712, best estimator lgbm's best error=0.2712
[flaml.automl: 11-15 07:08:20] {1826} INFO - iteration 7, current learner lgbm
[flaml.automl: 11-15 07:08:20] {2029} INFO - at 0.7s, estimator lgbm's best error=0.2197, best estimator lgbm's best error=0.2197
[flaml.automl: 11-15 07:08:20] {1826} INFO - iteration 8, current learner xgboost
[flaml.automl: 11-15 07:08:20] {2029} INFO - at 0.8s, estimator xgboost's best error=1.4958, best estimator lgbm's best error=0.2197
[flaml.automl: 11-15 07:08:20] {1826} INFO - iteration 9, current learner xgboost
[flaml.automl: 11-15 07:08:20] {2029} INFO - at 0.8s, estimator xgboost's best error=1.4958, best estimator lgbm's best error=0.2197
[flaml.automl: 11-15 07:08:20] {1826} INFO - iteration 10, current learner xgboost
[flaml.automl: 11-15 07:08:20] {2029} INFO - at 0.9s, estimator xgboost's best error=0.7052, best estimator lgbm's best error=0.2197
[flaml.automl: 11-15 07:08:20] {1826} INFO - iteration 11, current learner xgboost
[flaml.automl: 11-15 07:08:20] {2029} INFO - at 0.9s, estimator xgboost's best error=0.3619, best estimator lgbm's best error=0.2197
[flaml.automl: 11-15 07:08:20] {1826} INFO - iteration 12, current learner xgboost
[flaml.automl: 11-15 07:08:20] {2029} INFO - at 0.9s, estimator xgboost's best error=0.3619, best estimator lgbm's best error=0.2197
[flaml.automl: 11-15 07:08:20] {1826} INFO - iteration 13, current learner xgboost
[flaml.automl: 11-15 07:08:20] {2029} INFO - at 1.0s, estimator xgboost's best error=0.3619, best estimator lgbm's best error=0.2197
[flaml.automl: 11-15 07:08:20] {1826} INFO - iteration 14, current learner extra_tree
[flaml.automl: 11-15 07:08:20] {2029} INFO - at 1.1s, estimator extra_tree's best error=0.7197, best estimator lgbm's best error=0.2197
[flaml.automl: 11-15 07:08:20] {2242} INFO - retrain lgbm for 0.0s
[flaml.automl: 11-15 07:08:20] {2247} INFO - retrained model: LGBMRegressor(colsample_bytree=0.7610534336273627,
learning_rate=0.41929025492645006, max_bin=255,
min_child_samples=4, n_estimators=45, num_leaves=4,
reg_alpha=0.0009765625, reg_lambda=0.009280655005879943,
verbose=-1)
[flaml.automl: 11-15 07:08:20] {1608} INFO - fit succeeded
[flaml.automl: 11-15 07:08:20] {1610} INFO - Time taken to find the best model: 0.7289648056030273
[flaml.automl: 11-15 07:08:20] {1624} WARNING - Time taken to find the best model is 73% of the provided time budget and not all estimators' hyperparameter search converged. Consider increasing the time budget.
```
### Multi-output regression
We can combine `sklearn.multioutput.MultiOutputRegressor` and `flaml.AutoML` to perform AutoML for multi-output regression.
```python
from flaml import AutoML
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputRegressor
# create regression data
X, y = make_regression(n_targets=3)
# split into train and test data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=42)
# train the model
model = MultiOutputRegressor(AutoML(task="regression", time_budget=60))
model.fit(X_train, y_train)
# predict
print(model.predict(X_test))
```
It will perform AutoML for each target, spending up to 60 seconds on each.
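To sanity-check the fit, the standard scikit-learn `score` method can be used (a small sketch; for `MultiOutputRegressor` it reports R² averaged uniformly over the targets):
```python
# R^2 on the held-out data, averaged over the three targets
print(model.score(X_test, y_test))
```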


@@ -0,0 +1,203 @@
# AutoML - Time Series Forecast
### Prerequisites
Install the [ts_forecast] option.
```bash
pip install "flaml[ts_forecast]"
```
### Univariate time series
```python
import numpy as np
from flaml import AutoML
X_train = np.arange('2014-01', '2021-01', dtype='datetime64[M]')
y_train = np.random.random(size=72)
automl = AutoML()
automl.fit(X_train=X_train[:72],  # a single column of timestamps
           y_train=y_train,  # value for each timestamp
           period=12,  # time horizon to forecast, e.g., 12 months
           task='ts_forecast', time_budget=15,  # time budget in seconds
           log_file_name="ts_forecast.log",
           )
print(automl.predict(X_train[72:]))
```
#### Sample output
```
[flaml.automl: 11-15 18:44:49] {1485} INFO - Data split method: time
INFO:flaml.automl:Data split method: time
[flaml.automl: 11-15 18:44:49] {1489} INFO - Evaluation method: cv
INFO:flaml.automl:Evaluation method: cv
[flaml.automl: 11-15 18:44:49] {1540} INFO - Minimizing error metric: mape
INFO:flaml.automl:Minimizing error metric: mape
[flaml.automl: 11-15 18:44:49] {1577} INFO - List of ML learners in AutoML Run: ['prophet', 'arima', 'sarimax']
INFO:flaml.automl:List of ML learners in AutoML Run: ['prophet', 'arima', 'sarimax']
[flaml.automl: 11-15 18:44:49] {1826} INFO - iteration 0, current learner prophet
INFO:flaml.automl:iteration 0, current learner prophet
[flaml.automl: 11-15 18:45:00] {1944} INFO - Estimated sufficient time budget=104159s. Estimated necessary time budget=104s.
INFO:flaml.automl:Estimated sufficient time budget=104159s. Estimated necessary time budget=104s.
[flaml.automl: 11-15 18:45:00] {2029} INFO - at 10.5s, estimator prophet's best error=1.5681, best estimator prophet's best error=1.5681
INFO:flaml.automl: at 10.5s, estimator prophet's best error=1.5681, best estimator prophet's best error=1.5681
[flaml.automl: 11-15 18:45:00] {1826} INFO - iteration 1, current learner arima
INFO:flaml.automl:iteration 1, current learner arima
[flaml.automl: 11-15 18:45:00] {2029} INFO - at 10.7s, estimator arima's best error=2.3515, best estimator prophet's best error=1.5681
INFO:flaml.automl: at 10.7s, estimator arima's best error=2.3515, best estimator prophet's best error=1.5681
[flaml.automl: 11-15 18:45:00] {1826} INFO - iteration 2, current learner arima
INFO:flaml.automl:iteration 2, current learner arima
[flaml.automl: 11-15 18:45:01] {2029} INFO - at 11.5s, estimator arima's best error=2.1774, best estimator prophet's best error=1.5681
INFO:flaml.automl: at 11.5s, estimator arima's best error=2.1774, best estimator prophet's best error=1.5681
[flaml.automl: 11-15 18:45:01] {1826} INFO - iteration 3, current learner arima
INFO:flaml.automl:iteration 3, current learner arima
[flaml.automl: 11-15 18:45:01] {2029} INFO - at 11.9s, estimator arima's best error=2.1774, best estimator prophet's best error=1.5681
INFO:flaml.automl: at 11.9s, estimator arima's best error=2.1774, best estimator prophet's best error=1.5681
[flaml.automl: 11-15 18:45:01] {1826} INFO - iteration 4, current learner arima
INFO:flaml.automl:iteration 4, current learner arima
[flaml.automl: 11-15 18:45:02] {2029} INFO - at 12.9s, estimator arima's best error=1.8560, best estimator prophet's best error=1.5681
INFO:flaml.automl: at 12.9s, estimator arima's best error=1.8560, best estimator prophet's best error=1.5681
[flaml.automl: 11-15 18:45:02] {1826} INFO - iteration 5, current learner arima
INFO:flaml.automl:iteration 5, current learner arima
[flaml.automl: 11-15 18:45:04] {2029} INFO - at 14.4s, estimator arima's best error=1.8560, best estimator prophet's best error=1.5681
INFO:flaml.automl: at 14.4s, estimator arima's best error=1.8560, best estimator prophet's best error=1.5681
[flaml.automl: 11-15 18:45:04] {1826} INFO - iteration 6, current learner sarimax
INFO:flaml.automl:iteration 6, current learner sarimax
[flaml.automl: 11-15 18:45:04] {2029} INFO - at 14.7s, estimator sarimax's best error=2.3515, best estimator prophet's best error=1.5681
INFO:flaml.automl: at 14.7s, estimator sarimax's best error=2.3515, best estimator prophet's best error=1.5681
[flaml.automl: 11-15 18:45:04] {1826} INFO - iteration 7, current learner sarimax
INFO:flaml.automl:iteration 7, current learner sarimax
[flaml.automl: 11-15 18:45:04] {2029} INFO - at 15.0s, estimator sarimax's best error=1.6371, best estimator prophet's best error=1.5681
INFO:flaml.automl: at 15.0s, estimator sarimax's best error=1.6371, best estimator prophet's best error=1.5681
[flaml.automl: 11-15 18:45:05] {2242} INFO - retrain prophet for 0.5s
INFO:flaml.automl:retrain prophet for 0.5s
[flaml.automl: 11-15 18:45:05] {2247} INFO - retrained model: <prophet.forecaster.Prophet object at 0x7f042ba1da50>
INFO:flaml.automl:retrained model: <prophet.forecaster.Prophet object at 0x7f042ba1da50>
[flaml.automl: 11-15 18:45:05] {1608} INFO - fit succeeded
INFO:flaml.automl:fit succeeded
[flaml.automl: 11-15 18:45:05] {1610} INFO - Time taken to find the best model: 10.450132608413696
INFO:flaml.automl:Time taken to find the best model: 10.450132608413696
0 0.384715
1 0.191349
2 0.372324
3 0.814549
4 0.269616
5 0.470667
6 0.603665
7 0.256773
8 0.408787
9 0.663065
10 0.619943
11 0.090284
Name: yhat, dtype: float64
```
### Multivariate time series
```python
import statsmodels.api as sm
data = sm.datasets.co2.load_pandas().data
# data is given in weeks, but the task is to predict monthly, so use monthly averages instead
data = data['co2'].resample('MS').mean()
data = data.fillna(data.bfill()) # makes sure there are no missing values
data = data.to_frame().reset_index()
num_samples = data.shape[0]
time_horizon = 12
split_idx = num_samples - time_horizon
train_df = data[:split_idx] # train_df is a dataframe with two columns: timestamp and label
X_test = data[split_idx:]['index'].to_frame() # X_test is a dataframe with dates for prediction
y_test = data[split_idx:]['co2'] # y_test is a series of the values corresponding to the dates for prediction
from flaml import AutoML
automl = AutoML()
settings = {
    "time_budget": 10,  # total running time in seconds
    "metric": 'mape',  # primary metric for validation: 'mape' is generally used for forecast tasks
    "task": 'ts_forecast',  # task type
    "log_file_name": 'CO2_forecast.log',  # flaml log file
    "eval_method": "holdout",  # validation method can be chosen from ['auto', 'holdout', 'cv']
    "seed": 7654321,  # random seed
}
automl.fit(dataframe=train_df,  # training data
           label='co2',  # label column
           period=time_horizon,  # keyword argument 'period' must be included for the forecast task
           **settings)
```
#### Sample output
```
[flaml.automl: 11-15 18:54:12] {1485} INFO - Data split method: time
INFO:flaml.automl:Data split method: time
[flaml.automl: 11-15 18:54:12] {1489} INFO - Evaluation method: holdout
INFO:flaml.automl:Evaluation method: holdout
[flaml.automl: 11-15 18:54:13] {1540} INFO - Minimizing error metric: mape
INFO:flaml.automl:Minimizing error metric: mape
[flaml.automl: 11-15 18:54:13] {1577} INFO - List of ML learners in AutoML Run: ['prophet', 'arima', 'sarimax']
INFO:flaml.automl:List of ML learners in AutoML Run: ['prophet', 'arima', 'sarimax']
[flaml.automl: 11-15 18:54:13] {1826} INFO - iteration 0, current learner prophet
INFO:flaml.automl:iteration 0, current learner prophet
[flaml.automl: 11-15 18:54:15] {1944} INFO - Estimated sufficient time budget=25297s. Estimated necessary time budget=25s.
INFO:flaml.automl:Estimated sufficient time budget=25297s. Estimated necessary time budget=25s.
[flaml.automl: 11-15 18:54:15] {2029} INFO - at 2.6s, estimator prophet's best error=0.0008, best estimator prophet's best error=0.0008
INFO:flaml.automl: at 2.6s, estimator prophet's best error=0.0008, best estimator prophet's best error=0.0008
[flaml.automl: 11-15 18:54:15] {1826} INFO - iteration 1, current learner prophet
INFO:flaml.automl:iteration 1, current learner prophet
[flaml.automl: 11-15 18:54:18] {2029} INFO - at 5.2s, estimator prophet's best error=0.0008, best estimator prophet's best error=0.0008
INFO:flaml.automl: at 5.2s, estimator prophet's best error=0.0008, best estimator prophet's best error=0.0008
[flaml.automl: 11-15 18:54:18] {1826} INFO - iteration 2, current learner arima
INFO:flaml.automl:iteration 2, current learner arima
[flaml.automl: 11-15 18:54:18] {2029} INFO - at 5.5s, estimator arima's best error=0.0047, best estimator prophet's best error=0.0008
INFO:flaml.automl: at 5.5s, estimator arima's best error=0.0047, best estimator prophet's best error=0.0008
[flaml.automl: 11-15 18:54:18] {1826} INFO - iteration 3, current learner arima
INFO:flaml.automl:iteration 3, current learner arima
[flaml.automl: 11-15 18:54:18] {2029} INFO - at 5.6s, estimator arima's best error=0.0047, best estimator prophet's best error=0.0008
INFO:flaml.automl: at 5.6s, estimator arima's best error=0.0047, best estimator prophet's best error=0.0008
[flaml.automl: 11-15 18:54:18] {1826} INFO - iteration 4, current learner prophet
INFO:flaml.automl:iteration 4, current learner prophet
[flaml.automl: 11-15 18:54:21] {2029} INFO - at 8.1s, estimator prophet's best error=0.0005, best estimator prophet's best error=0.0005
INFO:flaml.automl: at 8.1s, estimator prophet's best error=0.0005, best estimator prophet's best error=0.0005
[flaml.automl: 11-15 18:54:21] {1826} INFO - iteration 5, current learner arima
INFO:flaml.automl:iteration 5, current learner arima
[flaml.automl: 11-15 18:54:21] {2029} INFO - at 8.9s, estimator arima's best error=0.0047, best estimator prophet's best error=0.0005
INFO:flaml.automl: at 8.9s, estimator arima's best error=0.0047, best estimator prophet's best error=0.0005
[flaml.automl: 11-15 18:54:21] {1826} INFO - iteration 6, current learner arima
INFO:flaml.automl:iteration 6, current learner arima
[flaml.automl: 11-15 18:54:22] {2029} INFO - at 9.7s, estimator arima's best error=0.0047, best estimator prophet's best error=0.0005
INFO:flaml.automl: at 9.7s, estimator arima's best error=0.0047, best estimator prophet's best error=0.0005
[flaml.automl: 11-15 18:54:22] {1826} INFO - iteration 7, current learner sarimax
INFO:flaml.automl:iteration 7, current learner sarimax
[flaml.automl: 11-15 18:54:23] {2029} INFO - at 10.1s, estimator sarimax's best error=0.0047, best estimator prophet's best error=0.0005
INFO:flaml.automl: at 10.1s, estimator sarimax's best error=0.0047, best estimator prophet's best error=0.0005
[flaml.automl: 11-15 18:54:23] {2242} INFO - retrain prophet for 0.9s
INFO:flaml.automl:retrain prophet for 0.9s
[flaml.automl: 11-15 18:54:23] {2247} INFO - retrained model: <prophet.forecaster.Prophet object at 0x7f0418e21f50>
INFO:flaml.automl:retrained model: <prophet.forecaster.Prophet object at 0x7f0418e21f50>
[flaml.automl: 11-15 18:54:23] {1608} INFO - fit succeeded
INFO:flaml.automl:fit succeeded
[flaml.automl: 11-15 18:54:23] {1610} INFO - Time taken to find the best model: 8.118467330932617
INFO:flaml.automl:Time taken to find the best model: 8.118467330932617
[flaml.automl: 11-15 18:54:23] {1624} WARNING - Time taken to find the best model is 81% of the provided time budget and not all estimators' hyperparameter search converged. Consider increasing the time budget.
WARNING:flaml.automl:Time taken to find the best model is 81% of the provided time budget and not all estimators' hyperparameter search converged. Consider increasing the time budget.
```
#### Compute and plot predictions
```python
flaml_y_pred = automl.predict(X_test)
import matplotlib.pyplot as plt
plt.plot(X_test, y_test, label='Actual level')
plt.plot(X_test, flaml_y_pred, label='FLAML forecast')
plt.xlabel('Date')
plt.ylabel('CO2 Levels')
plt.legend()
```
![png](images/CO2.png)
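To quantify the forecast error rather than only plotting it, MAPE can be computed directly with numpy (a small sketch reusing the variables defined above):
```python
import numpy as np
y_true = np.asarray(y_test)
y_hat = np.asarray(flaml_y_pred)
# mean absolute percentage error of the 12-month forecast
print("mape =", np.mean(np.abs((y_hat - y_true) / y_true)))
```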
[Link to notebook](https://github.com/microsoft/FLAML/blob/main/notebook/automl_time_series_forecast.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/automl_time_series_forecast.ipynb)


@@ -0,0 +1,195 @@
# AutoML for LightGBM
### Use built-in LGBMEstimator
```python
from flaml import AutoML
from flaml.data import load_openml_dataset
# Download the [houses dataset](https://www.openml.org/d/537) from OpenML. The task is to predict the median
# house price in a region based on its demographic composition and the state of the housing market in the region.
X_train, X_test, y_train, y_test = load_openml_dataset(dataset_id=537, data_dir='./')
automl = AutoML()
settings = {
    "time_budget": 60,  # total running time in seconds
    "metric": 'r2',  # primary metrics for regression can be chosen from: ['mae', 'mse', 'r2']
    "estimator_list": ['lgbm'],  # list of ML learners; we tune LightGBM in this example
    "task": 'regression',  # task type
    "log_file_name": 'houses_experiment.log',  # flaml log file
    "seed": 7654321,  # random seed
}
automl.fit(X_train=X_train, y_train=y_train, **settings)
```
#### Sample output
```
[flaml.automl: 11-15 19:46:44] {1485} INFO - Data split method: uniform
[flaml.automl: 11-15 19:46:44] {1489} INFO - Evaluation method: cv
[flaml.automl: 11-15 19:46:44] {1540} INFO - Minimizing error metric: 1-r2
[flaml.automl: 11-15 19:46:44] {1577} INFO - List of ML learners in AutoML Run: ['lgbm']
[flaml.automl: 11-15 19:46:44] {1826} INFO - iteration 0, current learner lgbm
[flaml.automl: 11-15 19:46:44] {1944} INFO - Estimated sufficient time budget=3232s. Estimated necessary time budget=3s.
[flaml.automl: 11-15 19:46:44] {2029} INFO - at 0.5s, estimator lgbm's best error=0.7383, best estimator lgbm's best error=0.7383
[flaml.automl: 11-15 19:46:44] {1826} INFO - iteration 1, current learner lgbm
[flaml.automl: 11-15 19:46:44] {2029} INFO - at 0.6s, estimator lgbm's best error=0.4774, best estimator lgbm's best error=0.4774
[flaml.automl: 11-15 19:46:44] {1826} INFO - iteration 2, current learner lgbm
[flaml.automl: 11-15 19:46:44] {2029} INFO - at 0.7s, estimator lgbm's best error=0.4774, best estimator lgbm's best error=0.4774
[flaml.automl: 11-15 19:46:44] {1826} INFO - iteration 3, current learner lgbm
[flaml.automl: 11-15 19:46:44] {2029} INFO - at 0.9s, estimator lgbm's best error=0.2985, best estimator lgbm's best error=0.2985
[flaml.automl: 11-15 19:46:44] {1826} INFO - iteration 4, current learner lgbm
[flaml.automl: 11-15 19:46:45] {2029} INFO - at 1.3s, estimator lgbm's best error=0.2337, best estimator lgbm's best error=0.2337
[flaml.automl: 11-15 19:46:45] {1826} INFO - iteration 5, current learner lgbm
[flaml.automl: 11-15 19:46:45] {2029} INFO - at 1.4s, estimator lgbm's best error=0.2337, best estimator lgbm's best error=0.2337
[flaml.automl: 11-15 19:46:45] {1826} INFO - iteration 6, current learner lgbm
[flaml.automl: 11-15 19:46:46] {2029} INFO - at 2.5s, estimator lgbm's best error=0.2219, best estimator lgbm's best error=0.2219
[flaml.automl: 11-15 19:46:46] {1826} INFO - iteration 7, current learner lgbm
[flaml.automl: 11-15 19:46:46] {2029} INFO - at 2.9s, estimator lgbm's best error=0.2219, best estimator lgbm's best error=0.2219
[flaml.automl: 11-15 19:46:46] {1826} INFO - iteration 8, current learner lgbm
[flaml.automl: 11-15 19:46:48] {2029} INFO - at 4.5s, estimator lgbm's best error=0.1764, best estimator lgbm's best error=0.1764
[flaml.automl: 11-15 19:46:48] {1826} INFO - iteration 9, current learner lgbm
[flaml.automl: 11-15 19:46:54] {2029} INFO - at 10.5s, estimator lgbm's best error=0.1630, best estimator lgbm's best error=0.1630
[flaml.automl: 11-15 19:46:54] {1826} INFO - iteration 10, current learner lgbm
[flaml.automl: 11-15 19:46:56] {2029} INFO - at 12.4s, estimator lgbm's best error=0.1630, best estimator lgbm's best error=0.1630
[flaml.automl: 11-15 19:46:56] {1826} INFO - iteration 11, current learner lgbm
[flaml.automl: 11-15 19:47:13] {2029} INFO - at 29.0s, estimator lgbm's best error=0.1630, best estimator lgbm's best error=0.1630
[flaml.automl: 11-15 19:47:13] {1826} INFO - iteration 12, current learner lgbm
[flaml.automl: 11-15 19:47:15] {2029} INFO - at 31.1s, estimator lgbm's best error=0.1630, best estimator lgbm's best error=0.1630
[flaml.automl: 11-15 19:47:15] {1826} INFO - iteration 13, current learner lgbm
[flaml.automl: 11-15 19:47:29] {2029} INFO - at 45.8s, estimator lgbm's best error=0.1564, best estimator lgbm's best error=0.1564
[flaml.automl: 11-15 19:47:33] {2242} INFO - retrain lgbm for 3.2s
[flaml.automl: 11-15 19:47:33] {2247} INFO - retrained model: LGBMRegressor(colsample_bytree=0.8025848209352517,
learning_rate=0.09100963138990374, max_bin=255,
min_child_samples=42, n_estimators=363, num_leaves=216,
reg_alpha=0.001113000336715291, reg_lambda=76.50614276906414,
verbose=-1)
[flaml.automl: 11-15 19:47:33] {1608} INFO - fit succeeded
[flaml.automl: 11-15 19:47:33] {1610} INFO - Time taken to find the best model: 45.75616669654846
[flaml.automl: 11-15 19:47:33] {1624} WARNING - Time taken to find the best model is 76% of the provided time budget and not all estimators' hyperparameter search converged. Consider increasing the time budget.
```
#### Retrieve best config
```python
print('Best hyperparameter config:', automl.best_config)
print('Best r2 on validation data: {0:.4g}'.format(1-automl.best_loss))
print('Training duration of best run: {0:.4g} s'.format(automl.best_config_train_time))
print(automl.model.estimator)
# Best hyperparameter config: {'n_estimators': 363, 'num_leaves': 216, 'min_child_samples': 42, 'learning_rate': 0.09100963138990374, 'log_max_bin': 8, 'colsample_bytree': 0.8025848209352517, 'reg_alpha': 0.001113000336715291, 'reg_lambda': 76.50614276906414}
# Best r2 on validation data: 0.8436
# Training duration of best run: 3.229 s
# LGBMRegressor(colsample_bytree=0.8025848209352517,
# learning_rate=0.09100963138990374, max_bin=255,
# min_child_samples=42, n_estimators=363, num_leaves=216,
# reg_alpha=0.001113000336715291, reg_lambda=76.50614276906414,
# verbose=-1)
```
#### Plot feature importance
```python
import matplotlib.pyplot as plt
plt.barh(automl.model.estimator.feature_name_, automl.model.estimator.feature_importances_)
```
![png](../Use-Cases/images/feature_importance.png)
#### Compute predictions of testing dataset
```python
y_pred = automl.predict(X_test)
print('Predicted labels', y_pred)
# Predicted labels [143391.65036562 245535.13731811 153171.44071629 ... 184354.52735963
# 235510.49470445 282617.22858956]
```
#### Compute different metric values on testing dataset
```python
from flaml.ml import sklearn_metric_loss_score
print('r2', '=', 1 - sklearn_metric_loss_score('r2', y_pred, y_test))
print('mse', '=', sklearn_metric_loss_score('mse', y_pred, y_test))
print('mae', '=', sklearn_metric_loss_score('mae', y_pred, y_test))
# r2 = 0.8505434326526395
# mse = 1975592613.138005
# mae = 29471.536046068788
```
#### Compare with untuned LightGBM
```python
from lightgbm import LGBMRegressor
lgbm = LGBMRegressor()
lgbm.fit(X_train, y_train)
y_pred = lgbm.predict(X_test)
from flaml.ml import sklearn_metric_loss_score
print('default lgbm r2', '=', 1 - sklearn_metric_loss_score('r2', y_pred, y_test))
# default lgbm r2 = 0.8296179648694404
```
#### Plot learning curve
How does the model accuracy improve as we search for different hyperparameter configurations?
```python
from flaml.data import get_output_from_log
import numpy as np
time_history, best_valid_loss_history, valid_loss_history, config_history, metric_history = \
get_output_from_log(filename=settings['log_file_name'], time_budget=60)
plt.title('Learning Curve')
plt.xlabel('Wall Clock Time (s)')
plt.ylabel('Validation r2')
plt.step(time_history, 1 - np.array(best_valid_loss_history), where='post')
plt.show()
```
![png](images/lgbm_curve.png)
### Use a customized LightGBM learner
The native API of LightGBM allows one to specify a custom objective function in the model constructor. You can easily enable it by adding a customized LightGBM learner in FLAML. In the following example, we show how to add such a customized LightGBM learner with a custom objective function.
#### Create a customized LightGBM learner with a custom objective function
```python
import numpy as np
# define your customized objective function
def my_loss_obj(y_true, y_pred):
    c = 0.5
    residual = y_pred - y_true
    grad = c * residual / (np.abs(residual) + c)
    hess = c ** 2 / (np.abs(residual) + c) ** 2
    # rmse grad and hess
    grad_rmse = residual
    hess_rmse = 1.0
    # mae grad and hess
    grad_mae = np.array(residual)
    grad_mae[grad_mae > 0] = 1.
    grad_mae[grad_mae <= 0] = -1.
    hess_mae = 1.0
    coef = [0.4, 0.3, 0.3]
    return coef[0] * grad + coef[1] * grad_rmse + coef[2] * grad_mae, \
        coef[0] * hess + coef[1] * hess_rmse + coef[2] * hess_mae

from flaml.model import LGBMEstimator

class MyLGBM(LGBMEstimator):
    """LGBMEstimator with my_loss_obj as the objective function"""

    def __init__(self, **config):
        super().__init__(objective=my_loss_obj, **config)
```
#### Add the customized learner and tune it
```python
automl = AutoML()
automl.add_learner(learner_name='my_lgbm', learner_class=MyLGBM)
settings["estimator_list"] = ['my_lgbm'] # change the estimator list
automl.fit(X_train=X_train, y_train=y_train, **settings)
```
[Link to notebook](https://github.com/microsoft/FLAML/blob/main/notebook/automl_lightgbm.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/automl_lightgbm.ipynb)


@@ -0,0 +1,219 @@
# AutoML for XGBoost
### Use built-in XGBoostSklearnEstimator
```python
from flaml import AutoML
from flaml.data import load_openml_dataset
# Download the [houses dataset](https://www.openml.org/d/537) from OpenML. The task is to predict the median
# house price in a region based on its demographic composition and the state of the housing market in the region.
X_train, X_test, y_train, y_test = load_openml_dataset(dataset_id=537, data_dir='./')
automl = AutoML()
settings = {
    "time_budget": 60,  # total running time in seconds
    "metric": 'r2',  # primary metrics for regression can be chosen from: ['mae', 'mse', 'r2']
    "estimator_list": ['xgboost'],  # list of ML learners; we tune XGBoost in this example
    "task": 'regression',  # task type
    "log_file_name": 'houses_experiment.log',  # flaml log file
    "seed": 7654321,  # random seed
}
automl.fit(X_train=X_train, y_train=y_train, **settings)
```
#### Sample output
```
[flaml.automl: 09-29 23:06:46] {1446} INFO - Data split method: uniform
[flaml.automl: 09-29 23:06:46] {1450} INFO - Evaluation method: cv
[flaml.automl: 09-29 23:06:46] {1496} INFO - Minimizing error metric: 1-r2
[flaml.automl: 09-29 23:06:46] {1533} INFO - List of ML learners in AutoML Run: ['xgboost']
[flaml.automl: 09-29 23:06:46] {1763} INFO - iteration 0, current learner xgboost
[flaml.automl: 09-29 23:06:47] {1880} INFO - Estimated sufficient time budget=2621s. Estimated necessary time budget=3s.
[flaml.automl: 09-29 23:06:47] {1952} INFO - at 0.3s, estimator xgboost's best error=2.1267, best estimator xgboost's best error=2.1267
[flaml.automl: 09-29 23:06:47] {1763} INFO - iteration 1, current learner xgboost
[flaml.automl: 09-29 23:06:47] {1952} INFO - at 0.5s, estimator xgboost's best error=2.1267, best estimator xgboost's best error=2.1267
[flaml.automl: 09-29 23:06:47] {1763} INFO - iteration 2, current learner xgboost
[flaml.automl: 09-29 23:06:47] {1952} INFO - at 0.6s, estimator xgboost's best error=0.8485, best estimator xgboost's best error=0.8485
[flaml.automl: 09-29 23:06:47] {1763} INFO - iteration 3, current learner xgboost
[flaml.automl: 09-29 23:06:47] {1952} INFO - at 0.8s, estimator xgboost's best error=0.3799, best estimator xgboost's best error=0.3799
[flaml.automl: 09-29 23:06:47] {1763} INFO - iteration 4, current learner xgboost
[flaml.automl: 09-29 23:06:47] {1952} INFO - at 1.0s, estimator xgboost's best error=0.3799, best estimator xgboost's best error=0.3799
[flaml.automl: 09-29 23:06:47] {1763} INFO - iteration 5, current learner xgboost
[flaml.automl: 09-29 23:06:47] {1952} INFO - at 1.2s, estimator xgboost's best error=0.3799, best estimator xgboost's best error=0.3799
[flaml.automl: 09-29 23:06:47] {1763} INFO - iteration 6, current learner xgboost
[flaml.automl: 09-29 23:06:48] {1952} INFO - at 1.5s, estimator xgboost's best error=0.2992, best estimator xgboost's best error=0.2992
[flaml.automl: 09-29 23:06:48] {1763} INFO - iteration 7, current learner xgboost
[flaml.automl: 09-29 23:06:48] {1952} INFO - at 1.9s, estimator xgboost's best error=0.2992, best estimator xgboost's best error=0.2992
[flaml.automl: 09-29 23:06:48] {1763} INFO - iteration 8, current learner xgboost
[flaml.automl: 09-29 23:06:49] {1952} INFO - at 2.2s, estimator xgboost's best error=0.2992, best estimator xgboost's best error=0.2992
[flaml.automl: 09-29 23:06:49] {1763} INFO - iteration 9, current learner xgboost
[flaml.automl: 09-29 23:06:49] {1952} INFO - at 2.5s, estimator xgboost's best error=0.2513, best estimator xgboost's best error=0.2513
[flaml.automl: 09-29 23:06:49] {1763} INFO - iteration 10, current learner xgboost
[flaml.automl: 09-29 23:06:49] {1952} INFO - at 2.8s, estimator xgboost's best error=0.2513, best estimator xgboost's best error=0.2513
[flaml.automl: 09-29 23:06:49] {1763} INFO - iteration 11, current learner xgboost
[flaml.automl: 09-29 23:06:49] {1952} INFO - at 3.0s, estimator xgboost's best error=0.2513, best estimator xgboost's best error=0.2513
[flaml.automl: 09-29 23:06:49] {1763} INFO - iteration 12, current learner xgboost
[flaml.automl: 09-29 23:06:50] {1952} INFO - at 3.3s, estimator xgboost's best error=0.2113, best estimator xgboost's best error=0.2113
[flaml.automl: 09-29 23:06:50] {1763} INFO - iteration 13, current learner xgboost
[flaml.automl: 09-29 23:06:50] {1952} INFO - at 3.5s, estimator xgboost's best error=0.2113, best estimator xgboost's best error=0.2113
[flaml.automl: 09-29 23:06:50] {1763} INFO - iteration 14, current learner xgboost
[flaml.automl: 09-29 23:06:50] {1952} INFO - at 4.0s, estimator xgboost's best error=0.2090, best estimator xgboost's best error=0.2090
[flaml.automl: 09-29 23:06:50] {1763} INFO - iteration 15, current learner xgboost
[flaml.automl: 09-29 23:06:51] {1952} INFO - at 4.5s, estimator xgboost's best error=0.2090, best estimator xgboost's best error=0.2090
[flaml.automl: 09-29 23:06:51] {1763} INFO - iteration 16, current learner xgboost
[flaml.automl: 09-29 23:06:51] {1952} INFO - at 5.2s, estimator xgboost's best error=0.1919, best estimator xgboost's best error=0.1919
[flaml.automl: 09-29 23:06:51] {1763} INFO - iteration 17, current learner xgboost
[flaml.automl: 09-29 23:06:52] {1952} INFO - at 5.5s, estimator xgboost's best error=0.1919, best estimator xgboost's best error=0.1919
[flaml.automl: 09-29 23:06:52] {1763} INFO - iteration 18, current learner xgboost
[flaml.automl: 09-29 23:06:54] {1952} INFO - at 8.0s, estimator xgboost's best error=0.1797, best estimator xgboost's best error=0.1797
[flaml.automl: 09-29 23:06:54] {1763} INFO - iteration 19, current learner xgboost
[flaml.automl: 09-29 23:06:55] {1952} INFO - at 9.0s, estimator xgboost's best error=0.1797, best estimator xgboost's best error=0.1797
[flaml.automl: 09-29 23:06:55] {1763} INFO - iteration 20, current learner xgboost
[flaml.automl: 09-29 23:07:08] {1952} INFO - at 21.8s, estimator xgboost's best error=0.1797, best estimator xgboost's best error=0.1797
[flaml.automl: 09-29 23:07:08] {1763} INFO - iteration 21, current learner xgboost
[flaml.automl: 09-29 23:07:11] {1952} INFO - at 24.4s, estimator xgboost's best error=0.1797, best estimator xgboost's best error=0.1797
[flaml.automl: 09-29 23:07:11] {1763} INFO - iteration 22, current learner xgboost
[flaml.automl: 09-29 23:07:16] {1952} INFO - at 30.0s, estimator xgboost's best error=0.1782, best estimator xgboost's best error=0.1782
[flaml.automl: 09-29 23:07:16] {1763} INFO - iteration 23, current learner xgboost
[flaml.automl: 09-29 23:07:20] {1952} INFO - at 33.5s, estimator xgboost's best error=0.1782, best estimator xgboost's best error=0.1782
[flaml.automl: 09-29 23:07:20] {1763} INFO - iteration 24, current learner xgboost
[flaml.automl: 09-29 23:07:29] {1952} INFO - at 42.3s, estimator xgboost's best error=0.1782, best estimator xgboost's best error=0.1782
[flaml.automl: 09-29 23:07:29] {1763} INFO - iteration 25, current learner xgboost
[flaml.automl: 09-29 23:07:30] {1952} INFO - at 43.2s, estimator xgboost's best error=0.1782, best estimator xgboost's best error=0.1782
[flaml.automl: 09-29 23:07:30] {1763} INFO - iteration 26, current learner xgboost
[flaml.automl: 09-29 23:07:50] {1952} INFO - at 63.4s, estimator xgboost's best error=0.1663, best estimator xgboost's best error=0.1663
[flaml.automl: 09-29 23:07:50] {2059} INFO - selected model: <xgboost.core.Booster object at 0x7f6399005910>
[flaml.automl: 09-29 23:07:55] {2122} INFO - retrain xgboost for 5.4s
[flaml.automl: 09-29 23:07:55] {2128} INFO - retrained model: <xgboost.core.Booster object at 0x7f6398fc0eb0>
[flaml.automl: 09-29 23:07:55] {1557} INFO - fit succeeded
[flaml.automl: 09-29 23:07:55] {1558} INFO - Time taken to find the best model: 63.427649974823
[flaml.automl: 09-29 23:07:55] {1569} WARNING - Time taken to find the best model is 106% of the provided time budget and not all estimators' hyperparameter search converged. Consider increasing the time budget.
```
#### Retrieve best config
```python
print('Best hyperparameter config:', automl.best_config)
print('Best r2 on validation data: {0:.4g}'.format(1-automl.best_loss))
print('Training duration of best run: {0:.4g} s'.format(automl.best_config_train_time))
print(automl.model.estimator)
# Best hyperparameter config: {'n_estimators': 473, 'max_leaves': 35, 'max_depth': 0, 'min_child_weight': 0.001, 'learning_rate': 0.26865031351923346, 'subsample': 0.9718245679598786, 'colsample_bylevel': 0.7421362469066445, 'colsample_bytree': 1.0, 'reg_alpha': 0.06824336834995245, 'reg_lambda': 250.9654222583276}
# Best r2 on validation data: 0.8384
# Training duration of best run: 2.194 s
# XGBRegressor(base_score=0.5, booster='gbtree',
# colsample_bylevel=0.7421362469066445, colsample_bynode=1,
# colsample_bytree=1.0, gamma=0, gpu_id=-1, grow_policy='lossguide',
# importance_type='gain', interaction_constraints='',
# learning_rate=0.26865031351923346, max_delta_step=0, max_depth=0,
# max_leaves=35, min_child_weight=0.001, missing=nan,
# monotone_constraints='()', n_estimators=473, n_jobs=-1,
# num_parallel_tree=1, random_state=0, reg_alpha=0.06824336834995245,
# reg_lambda=250.9654222583276, scale_pos_weight=1,
# subsample=0.9718245679598786, tree_method='hist',
# use_label_encoder=False, validate_parameters=1, verbosity=0)
```
#### Plot feature importance
```python
import matplotlib.pyplot as plt
plt.barh(X_train.columns, automl.model.estimator.feature_importances_)
```
![png](images/xgb_feature_importance.png)
#### Compute predictions of testing dataset
```python
y_pred = automl.predict(X_test)
print('Predicted labels', y_pred)
# Predicted labels [139062.95 237622. 140522.03 ... 182125.5 252156.36 264884.5 ]
```
#### Compute different metric values on testing dataset
```python
from flaml.ml import sklearn_metric_loss_score
print('r2', '=', 1 - sklearn_metric_loss_score('r2', y_pred, y_test))
print('mse', '=', sklearn_metric_loss_score('mse', y_pred, y_test))
print('mae', '=', sklearn_metric_loss_score('mae', y_pred, y_test))
# r2 = 0.8456494234135888
# mse = 2040284106.2781258
# mae = 30212.830996680445
```
#### Compare with untuned XGBoost
```python
from xgboost import XGBRegressor
xgb = XGBRegressor()
xgb.fit(X_train, y_train)
y_pred = xgb.predict(X_test)
from flaml.ml import sklearn_metric_loss_score
print('default xgboost r2', '=', 1 - sklearn_metric_loss_score('r2', y_pred, y_test))
# default xgboost r2 = 0.8265451174596482
```
#### Plot learning curve
How does the model accuracy improve as we search for different hyperparameter configurations?
```python
from flaml.data import get_output_from_log
import numpy as np
time_history, best_valid_loss_history, valid_loss_history, config_history, metric_history = \
get_output_from_log(filename=settings['log_file_name'], time_budget=60)
plt.title('Learning Curve')
plt.xlabel('Wall Clock Time (s)')
plt.ylabel('Validation r2')
plt.step(time_history, 1 - np.array(best_valid_loss_history), where='post')
plt.show()
```
![png](images/xgb_curve.png)
### Use a customized XGBoost learner
You can easily enable a custom objective function by adding a customized XGBoost learner (inheriting `XGBoostEstimator` or `XGBoostSklearnEstimator`) in FLAML. In the following example, we show how to add such a customized XGBoost learner with a custom objective function.
```python
import numpy as np
# define your customized objective function
def logregobj(preds, dtrain):
labels = dtrain.get_label()
preds = 1.0 / (1.0 + np.exp(-preds)) # transform raw leaf weight
grad = preds - labels
hess = preds * (1.0 - preds)
return grad, hess
from flaml.model import XGBoostEstimator
class MyXGB1(XGBoostEstimator):
'''XGBoostEstimator with the logregobj function as the objective function
'''
def __init__(self, **config):
super().__init__(objective=logregobj, **config)
class MyXGB2(XGBoostEstimator):
'''XGBoostEstimator with 'reg:gamma' as the objective function
'''
def __init__(self, **config):
super().__init__(objective='reg:gamma', **config)
```
#### Add the customized learners and tune them
```python
automl = AutoML()
automl.add_learner(learner_name='my_xgb1', learner_class=MyXGB1)
automl.add_learner(learner_name='my_xgb2', learner_class=MyXGB2)
settings["estimator_list"] = ['my_xgb1', 'my_xgb2'] # change the estimator list
automl.fit(X_train=X_train, y_train=y_train, **settings)
```
[Link to notebook](https://github.com/microsoft/FLAML/blob/main/notebook/automl_xgboost.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/automl_xgboost.ipynb)

View File

@@ -0,0 +1,51 @@
FLAML can be used together with AzureML and mlflow.
### Prerequisites
Install the [azureml] option.
```bash
pip install "flaml[azureml]"
```
Set up an AzureML workspace:
```python
from azureml.core import Workspace
ws = Workspace.create(name='myworkspace', subscription_id='<azure-subscription-id>', resource_group='myresourcegroup')
```
### Enable mlflow in AzureML workspace
```python
import mlflow
from azureml.core import Workspace
ws = Workspace.from_config()
mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())
```
### Start an AutoML run
```python
from flaml.data import load_openml_dataset
# Download [Airlines dataset](https://www.openml.org/d/1169) from OpenML. The task is to predict whether a given flight will be delayed, given the information of the scheduled departure.
X_train, X_test, y_train, y_test = load_openml_dataset(dataset_id=1169, data_dir="./")
from flaml import AutoML
automl = AutoML()
settings = {
"time_budget": 60, # total running time in seconds
"metric": "accuracy", # metric to optimize
"task": "classification", # task type
"log_file_name": "airlines_experiment.log", # flaml log file
}
mlflow.set_experiment("flaml") # the experiment name in AzureML workspace
with mlflow.start_run() as run: # create a mlflow run
automl.fit(X_train=X_train, y_train=y_train, **settings)
```
The metrics in the run will be automatically logged in an experiment named "flaml" in your AzureML workspace.
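If you want to inspect the logged results programmatically after the run, you can query the experiment via the mlflow client. Below is a minimal sketch, assuming the experiment name "flaml" set above:
```python
import mlflow
# look up the experiment created above and list its runs as a pandas DataFrame
experiment = mlflow.get_experiment_by_name("flaml")
runs = mlflow.search_runs(experiment_ids=[experiment.experiment_id])
print(runs[["run_id", "status"]].head())
```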
[Link to notebook](https://github.com/microsoft/FLAML/blob/main/notebook/integrate_azureml.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/integrate_azureml.ipynb)

View File

@@ -0,0 +1,63 @@
As FLAML's AutoML module can be used as a transformer in a scikit-learn pipeline, we can get all the benefits of a pipeline.
### Load data
```python
from flaml.data import load_openml_dataset
# Download [Airlines dataset](https://www.openml.org/d/1169) from OpenML. The task is to predict whether a given flight will be delayed, given the information of the scheduled departure.
X_train, X_test, y_train, y_test = load_openml_dataset(
dataset_id=1169, data_dir='./', random_state=1234, dataset_format='array')
```
### Create a pipeline
```python
from sklearn import set_config
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from flaml import AutoML
set_config(display='diagram')
imputer = SimpleImputer()
standardizer = StandardScaler()
automl = AutoML()
automl_pipeline = Pipeline([
("imputuer",imputer),
("standardizer", standardizer),
("automl", automl)
])
automl_pipeline
```
![png](images/pipeline.png)
### Run AutoML in the pipeline
```python
settings = {
"time_budget": 60, # total running time in seconds
"metric": 'accuracy', # primary metrics can be chosen from: ['accuracy','roc_auc', 'roc_auc_ovr', 'roc_auc_ovo', 'f1','log_loss','mae','mse','r2']
"task": 'classification', # task type
"estimator_list":['xgboost','catboost','lgbm'],
"log_file_name": 'airlines_experiment.log', # flaml log file
}
# prefix each setting with "automl__" so that the pipeline routes it to the automl step
pipeline_settings = {f"automl__{key}": value for key, value in settings.items()}
automl_pipeline.fit(X_train, y_train, **pipeline_settings)
```
### Get the automl object from the pipeline
```python
automl = automl_pipeline.steps[2][1]
# Get the best config and best learner
print('Best ML learner:', automl.best_estimator)
print('Best hyperparameter config:', automl.best_config)
print('Best accuracy on validation data: {0:.4g}'.format(1 - automl.best_loss))
print('Training duration of best run: {0:.4g} s'.format(automl.best_config_train_time))
```
[Link to notebook](https://github.com/microsoft/FLAML/blob/main/notebook/integrate_sklearn.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/integrate_sklearn.ipynb)

View File

@@ -0,0 +1,187 @@
# Tune - HuggingFace
This example uses flaml to finetune a transformer model from the Huggingface transformers library.
### Requirements
This example requires a GPU. Install the dependencies:
```bash
pip install torch transformers datasets "flaml[blendsearch,ray]"
```
### Prepare for tuning
#### Tokenizer
```python
from transformers import AutoTokenizer
MODEL_NAME = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, use_fast=True)
COLUMN_NAME = "sentence"
def tokenize(examples):
return tokenizer(examples[COLUMN_NAME], truncation=True)
```
#### Define training method
```python
import flaml
import datasets
import numpy as np
from transformers import AutoModelForSequenceClassification, TrainingArguments, Trainer
TASK = "cola"
NUM_LABELS = 2
def train_distilbert(config: dict):
# Load CoLA dataset and apply tokenizer
cola_raw = datasets.load_dataset("glue", TASK)
cola_encoded = cola_raw.map(tokenize, batched=True)
train_dataset, eval_dataset = cola_encoded["train"], cola_encoded["validation"]
model = AutoModelForSequenceClassification.from_pretrained(
MODEL_NAME, num_labels=NUM_LABELS
)
metric = datasets.load_metric("glue", TASK)
def compute_metrics(eval_pred):
predictions, labels = eval_pred
predictions = np.argmax(predictions, axis=1)
return metric.compute(predictions=predictions, references=labels)
training_args = TrainingArguments(
output_dir='.',
do_eval=False,
disable_tqdm=True,
logging_steps=20000,
save_total_limit=0,
**config,
)
trainer = Trainer(
model,
training_args,
train_dataset=train_dataset,
eval_dataset=eval_dataset,
tokenizer=tokenizer,
compute_metrics=compute_metrics,
)
# train model
trainer.train()
# evaluate model
eval_output = trainer.evaluate()
# report the metric to optimize & the metric to log
flaml.tune.report(
loss=eval_output["eval_loss"],
matthews_correlation=eval_output["eval_matthews_correlation"],
)
```
### Define the search
We are now ready to define our search. This includes:
- The `search_space` for our hyperparameters
- The `metric` and the `mode` ('max' or 'min') for optimization
- The constraints (`n_cpus`, `n_gpus`, `num_samples`, and `time_budget_s`)
```python
max_num_epoch = 64
search_space = {
# You can mix constants with search space objects.
"num_train_epochs": flaml.tune.loguniform(1, max_num_epoch),
"learning_rate": flaml.tune.loguniform(1e-6, 1e-4),
"adam_epsilon": flaml.tune.loguniform(1e-9, 1e-7),
"adam_beta1": flaml.tune.uniform(0.8, 0.99),
"adam_beta2": flaml.tune.loguniform(98e-2, 9999e-4),
}
# optimization objective
HP_METRIC, MODE = "matthews_correlation", "max"
# resources
num_cpus = 4
num_gpus = 4 # change according to your GPU resources
# constraints
num_samples = -1 # number of trials, -1 means unlimited
time_budget_s = 3600 # time budget in seconds
```
### Launch the tuning
We are now ready to launch the tuning using `flaml.tune.run`:
```python
import ray
import time
ray.init(num_cpus=num_cpus, num_gpus=num_gpus)
start_time = time.time()  # record the start time so the elapsed time can be reported later
print("Tuning started...")
analysis = flaml.tune.run(
train_distilbert,
search_alg=flaml.CFO(
space=search_space,
metric=HP_METRIC,
mode=MODE,
low_cost_partial_config={"num_train_epochs": 1}),
resources_per_trial={"gpu": num_gpus, "cpu": num_cpus},
local_dir='logs/',
num_samples=num_samples,
time_budget_s=time_budget_s,
use_ray=True,
)
```
This will run tuning for one hour. At the end we will see a summary.
```
== Status ==
Memory usage on this node: 32.0/251.6 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/4 CPUs, 0/4 GPUs, 0.0/150.39 GiB heap, 0.0/47.22 GiB objects (0/1.0 accelerator_type:V100)
Result logdir: /home/chiw/FLAML/notebook/logs/train_distilbert_2021-05-07_02-35-58
Number of trials: 22/infinite (22 TERMINATED)
Trial name status loc adam_beta1 adam_beta2 adam_epsilon learning_rate num_train_epochs iter total time (s) loss matthews_correlation
train_distilbert_a0c303d0 TERMINATED 0.939079 0.991865 7.96945e-08 5.61152e-06 1 1 55.6909 0.587986 0
train_distilbert_a0c303d1 TERMINATED 0.811036 0.997214 2.05111e-09 2.05134e-06 1.44427 1 71.7663 0.603018 0
train_distilbert_c39b2ef0 TERMINATED 0.909395 0.993715 1e-07 5.26543e-06 1 1 53.7619 0.586518 0
train_distilbert_f00776e2 TERMINATED 0.968763 0.990019 4.38943e-08 5.98035e-06 1.02723 1 56.8382 0.581313 0
train_distilbert_11ab3900 TERMINATED 0.962198 0.991838 7.09296e-08 5.06608e-06 1 1 54.0231 0.585576 0
train_distilbert_353025b6 TERMINATED 0.91596 0.991892 8.95426e-08 6.21568e-06 2.15443 1 98.3233 0.531632 0.388893
train_distilbert_5728a1de TERMINATED 0.926933 0.993146 1e-07 1.00902e-05 1 1 55.3726 0.538505 0.280558
train_distilbert_9394c2e2 TERMINATED 0.928106 0.990614 4.49975e-08 3.45674e-06 2.72935 1 121.388 0.539177 0.327295
train_distilbert_b6543fec TERMINATED 0.876896 0.992098 1e-07 7.01176e-06 1.59538 1 76.0244 0.527516 0.379177
train_distilbert_0071f998 TERMINATED 0.955024 0.991687 7.39776e-08 5.50998e-06 2.90939 1 126.871 0.516225 0.417157
train_distilbert_2f830be6 TERMINATED 0.886931 0.989628 7.6127e-08 4.37646e-06 1.53338 1 73.8934 0.551629 0.0655887
train_distilbert_7ce03f12 TERMINATED 0.984053 0.993956 8.70144e-08 7.82557e-06 4.08775 1 174.027 0.523732 0.453549
train_distilbert_aaab0508 TERMINATED 0.940707 0.993946 1e-07 8.91979e-06 3.40243 1 146.249 0.511288 0.45085
train_distilbert_14262454 TERMINATED 0.99 0.991696 4.60093e-08 4.83405e-06 3.4954 1 152.008 0.53506 0.400851
train_distilbert_6d211fe6 TERMINATED 0.959277 0.994556 5.40791e-08 1.17333e-05 6.64995 1 271.444 0.609851 0.526802
train_distilbert_c980bae4 TERMINATED 0.99 0.993355 1e-07 5.21929e-06 2.51275 1 111.799 0.542276 0.324968
train_distilbert_6d0d29d6 TERMINATED 0.965773 0.995182 9.9752e-08 1.15549e-05 13.694 1 527.944 0.923802 0.549474
train_distilbert_b16ea82a TERMINATED 0.952781 0.993931 2.93182e-08 1.19145e-05 3.2293 1 139.844 0.533466 0.451307
train_distilbert_eddf7cc0 TERMINATED 0.99 0.997109 8.13498e-08 1.28515e-05 15.5807 1 614.789 0.983285 0.56993
train_distilbert_43008974 TERMINATED 0.929089 0.993258 1e-07 1.03892e-05 12.0357 1 474.387 0.857461 0.520022
train_distilbert_b3408a4e TERMINATED 0.99 0.993809 4.67441e-08 1.10418e-05 11.9165 1 474.126 0.828205 0.526164
train_distilbert_cfbfb220 TERMINATED 0.979454 0.9999 1e-07 1.49578e-05 20.3715
```
### Retrieve the results
```python
best_trial = analysis.get_best_trial(HP_METRIC, MODE, "all")
metric = best_trial.metric_analysis[HP_METRIC][MODE]
print(f"n_trials={len(analysis.trials)}")
print(f"time={time.time()-start_time}")
print(f"Best model eval {HP_METRIC}: {metric:.4f}")
print(f"Best model parameters: {best_trial.config}")
# n_trials=22
# time=3999.769361972809
# Best model eval matthews_correlation: 0.5699
# Best model parameters: {'num_train_epochs': 15.580684188655825, 'learning_rate': 1.2851507818900338e-05, 'adam_epsilon': 8.134982521948352e-08, 'adam_beta1': 0.99, 'adam_beta2': 0.9971094424784387}
```
[Link to notebook](https://github.com/microsoft/FLAML/blob/main/notebook/tune_huggingface.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/tune_huggingface.ipynb)

View File

@@ -0,0 +1,286 @@
# Tune - PyTorch
This example uses flaml to tune a pytorch model on CIFAR10.
## Prepare for tuning
### Requirements
```bash
pip install torchvision "flaml[blendsearch,ray]"
```
Before we are ready for tuning, we first need to define the neural network that we would like to tune.
### Network Specification
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import random_split
import torchvision
import torchvision.transforms as transforms
class Net(nn.Module):
def __init__(self, l1=120, l2=84):
super(Net, self).__init__()
self.conv1 = nn.Conv2d(3, 6, 5)
self.pool = nn.MaxPool2d(2, 2)
self.conv2 = nn.Conv2d(6, 16, 5)
self.fc1 = nn.Linear(16 * 5 * 5, l1)
self.fc2 = nn.Linear(l1, l2)
self.fc3 = nn.Linear(l2, 10)
def forward(self, x):
x = self.pool(F.relu(self.conv1(x)))
x = self.pool(F.relu(self.conv2(x)))
x = x.view(-1, 16 * 5 * 5)
x = F.relu(self.fc1(x))
x = F.relu(self.fc2(x))
x = self.fc3(x)
return x
```
### Data
```python
def load_data(data_dir="data"):
transform = transforms.Compose([
transforms.ToTensor(),
transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])
trainset = torchvision.datasets.CIFAR10(
root=data_dir, train=True, download=True, transform=transform)
testset = torchvision.datasets.CIFAR10(
root=data_dir, train=False, download=True, transform=transform)
return trainset, testset
```
### Training
```python
import os
import logging
from ray import tune
logger = logging.getLogger(__name__)
def train_cifar(config, checkpoint_dir=None, data_dir=None):
if "l1" not in config:
logger.warning(config)
net = Net(2**config["l1"], 2**config["l2"])
device = "cpu"
if torch.cuda.is_available():
device = "cuda:0"
if torch.cuda.device_count() > 1:
net = nn.DataParallel(net)
net.to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=config["lr"], momentum=0.9)
# The `checkpoint_dir` parameter gets passed by Ray Tune when a checkpoint
# should be restored.
if checkpoint_dir:
checkpoint = os.path.join(checkpoint_dir, "checkpoint")
model_state, optimizer_state = torch.load(checkpoint)
net.load_state_dict(model_state)
optimizer.load_state_dict(optimizer_state)
trainset, testset = load_data(data_dir)
test_abs = int(len(trainset) * 0.8)
train_subset, val_subset = random_split(
trainset, [test_abs, len(trainset) - test_abs])
trainloader = torch.utils.data.DataLoader(
train_subset,
batch_size=int(2**config["batch_size"]),
shuffle=True,
num_workers=4)
valloader = torch.utils.data.DataLoader(
val_subset,
batch_size=int(2**config["batch_size"]),
shuffle=True,
num_workers=4)
for epoch in range(int(round(config["num_epochs"]))): # loop over the dataset multiple times
running_loss = 0.0
epoch_steps = 0
for i, data in enumerate(trainloader, 0):
# get the inputs; data is a list of [inputs, labels]
inputs, labels = data
inputs, labels = inputs.to(device), labels.to(device)
# zero the parameter gradients
optimizer.zero_grad()
# forward + backward + optimize
outputs = net(inputs)
loss = criterion(outputs, labels)
loss.backward()
optimizer.step()
# print statistics
running_loss += loss.item()
epoch_steps += 1
if i % 2000 == 1999: # print every 2000 mini-batches
print("[%d, %5d] loss: %.3f" % (epoch + 1, i + 1,
running_loss / epoch_steps))
running_loss = 0.0
# Validation loss
val_loss = 0.0
val_steps = 0
total = 0
correct = 0
for i, data in enumerate(valloader, 0):
with torch.no_grad():
inputs, labels = data
inputs, labels = inputs.to(device), labels.to(device)
outputs = net(inputs)
_, predicted = torch.max(outputs.data, 1)
total += labels.size(0)
correct += (predicted == labels).sum().item()
loss = criterion(outputs, labels)
val_loss += loss.cpu().numpy()
val_steps += 1
# Here we save a checkpoint. It is automatically registered with
# Ray Tune and will potentially be passed as the `checkpoint_dir`
# parameter in future iterations.
with tune.checkpoint_dir(step=epoch) as checkpoint_dir:
path = os.path.join(checkpoint_dir, "checkpoint")
torch.save(
(net.state_dict(), optimizer.state_dict()), path)
tune.report(loss=(val_loss / val_steps), accuracy=correct / total)
print("Finished Training")
```
### Test Accuracy
```python
def _test_accuracy(net, device="cpu"):
trainset, testset = load_data()
testloader = torch.utils.data.DataLoader(
testset, batch_size=4, shuffle=False, num_workers=2)
correct = 0
total = 0
with torch.no_grad():
for data in testloader:
images, labels = data
images, labels = images.to(device), labels.to(device)
outputs = net(images)
_, predicted = torch.max(outputs.data, 1)
total += labels.size(0)
correct += (predicted == labels).sum().item()
return correct / total
```
## Hyperparameter Optimization
```python
import numpy as np
import flaml
import os
data_dir = os.path.abspath("data")
load_data(data_dir) # Download data for all trials before starting the run
```
### Search space
```python
max_num_epoch = 100
config = {
"l1": tune.randint(2, 9), # log transformed with base 2
"l2": tune.randint(2, 9), # log transformed with base 2
"lr": tune.loguniform(1e-4, 1e-1),
"num_epochs": tune.loguniform(1, max_num_epoch),
"batch_size": tune.randint(1, 5) # log transformed with base 2
}
```
### Budget and resource constraints
```python
time_budget_s = 600 # time budget in seconds
gpus_per_trial = 0.5 # number of gpus for each trial; 0.5 means two training jobs can share one gpu
num_samples = 500 # maximal number of trials
np.random.seed(7654321)
```
### Launch the tuning
```python
import time
start_time = time.time()
result = flaml.tune.run(
tune.with_parameters(train_cifar, data_dir=data_dir),
config=config,
metric="loss",
mode="min",
low_cost_partial_config={"num_epochs": 1},
max_resource=max_num_epoch,
min_resource=1,
scheduler="asha", # Use asha scheduler to perform early stopping based on intermediate results reported
resources_per_trial={"cpu": 1, "gpu": gpus_per_trial},
local_dir='logs/',
num_samples=num_samples,
time_budget_s=time_budget_s,
use_ray=True)
```
### Check the result
```python
print(f"#trials={len(result.trials)}")
print(f"time={time.time()-start_time}")
best_trial = result.get_best_trial("loss", "min", "all")
print("Best trial config: {}".format(best_trial.config))
print("Best trial final validation loss: {}".format(
best_trial.metric_analysis["loss"]["min"]))
print("Best trial final validation accuracy: {}".format(
best_trial.metric_analysis["accuracy"]["max"]))
best_trained_model = Net(2**best_trial.config["l1"],
2**best_trial.config["l2"])
device = "cpu"
if torch.cuda.is_available():
device = "cuda:0"
if gpus_per_trial > 1:
best_trained_model = nn.DataParallel(best_trained_model)
best_trained_model.to(device)
checkpoint_path = os.path.join(best_trial.checkpoint.value, "checkpoint")
model_state, optimizer_state = torch.load(checkpoint_path)
best_trained_model.load_state_dict(model_state)
test_acc = _test_accuracy(best_trained_model, device)
print("Best trial test set accuracy: {}".format(test_acc))
```
### Sample of output
```
#trials=44
time=1193.913584947586
Best trial config: {'l1': 8, 'l2': 8, 'lr': 0.0008818671030627281, 'num_epochs': 55.9513429004283, 'batch_size': 3}
Best trial final validation loss: 1.0694482081472874
Best trial final validation accuracy: 0.6389
Files already downloaded and verified
Files already downloaded and verified
Best trial test set accuracy: 0.6294
```
[Link to notebook](https://github.com/microsoft/FLAML/blob/main/notebook/tune_pytorch.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/tune_pytorch.ipynb)

website/docs/FAQ.md Normal file
View File

@@ -0,0 +1,30 @@
# Frequently Asked Questions
### About `low_cost_partial_config` in `tune`.
- Definition and purpose: The `low_cost_partial_config` is a dictionary of subset of the hyperparameter coordinates whose value corresponds to a configuration with known low-cost (i.e., low computation cost for training the corresponding model). The concept of low/high-cost is meaningful in the case where a subset of the hyperparameters to tune directly affects the computation cost for training the model. For example, `n_estimators` and `max_leaves` are known to affect the training cost of tree-based learners. We call this subset of hyperparameters, *cost-related hyperparameters*. In such scenarios, if you are aware of low-cost configurations for the cost-related hyperparameters, you are recommended to set them as the `low_cost_partial_config`. Using the tree-based method example again, since we know that small `n_estimators` and `max_leaves` generally correspond to simpler models and thus lower cost, we set `{'n_estimators': 4, 'max_leaves': 4}` as the `low_cost_partial_config` by default (note that `4` is the lower bound of search space for these two hyperparameters), e.g., in [LGBM](https://github.com/microsoft/FLAML/blob/main/flaml/model.py#L215). Configuring `low_cost_partial_config` helps the search algorithms make more cost-efficient choices.
In AutoML, the `low_cost_init_value` in `search_space()` function for each estimator serves the same role.
- Usage in practice: It is recommended to configure it if there are cost-related hyperparameters in your tuning task and you happen to know the low-cost values for them, but it is not required (it is fine to leave it at the default value, i.e., `None`).
- How does it work: `low_cost_partial_config`, if configured, will be used as an initial point of the search. It also affects the search trajectory. For more details about how it plays a role in the search algorithms, please refer to the papers about the search algorithms used: Section 2 of [Frugal Optimization for Cost-related Hyperparameters (CFO)](https://arxiv.org/pdf/2005.01571.pdf) and Section 3 of [Economical Hyperparameter Optimization with Blended Search Strategy (BlendSearch)](https://openreview.net/pdf?id=VbLH04pRA3). A minimal usage sketch follows.
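The sketch below passes `low_cost_partial_config` to `flaml.tune.run`. The evaluation function `train_lgbm` and the search space `config_search_space` are assumed to be defined as in the [Getting Started](Getting-Started#tune-user-defined-function) example; the low-cost values shown are the ones discussed above.
```python
from flaml import tune

analysis = tune.run(
    train_lgbm,  # user-defined evaluation function (assumed to be defined elsewhere)
    metric="mse",
    mode="min",
    config=config_search_space,  # search space containing n_estimators and num_leaves (assumed)
    low_cost_partial_config={"n_estimators": 4, "num_leaves": 4},  # known low-cost values
    time_budget_s=10,
    num_samples=-1,
)
```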
### How does FLAML handle imbalanced data (unequal distribution of target classes in classification task)?
Currently FLAML does several things for imbalanced data.
1. When a class contains fewer than 20 examples, we repeatedly add these examples to the training data until the count is at least 20.
2. We use stratified sampling when doing holdout and k-fold cross-validation.
3. We make sure no class is empty in both training and holdout data.
4. We allow users to pass `sample_weight` to `AutoML.fit()`; a sketch follows this list.
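For example, one way to use `sample_weight` on imbalanced data is to compute balanced weights with scikit-learn and pass them to `fit()`. This is only a sketch, assuming `X_train` and `y_train` are already loaded:
```python
from sklearn.utils.class_weight import compute_sample_weight
from flaml import AutoML

# "balanced" assigns each example a weight inversely proportional to its class frequency
weights = compute_sample_weight(class_weight="balanced", y=y_train)

automl = AutoML()
automl.fit(X_train, y_train, task="classification", time_budget=60, sample_weight=weights)
```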
### How to interpret model performance? Is it possible for me to visualize feature importance, SHAP values, optimization history?
You can use ```automl.model.estimator.feature_importances_``` to get the `feature_importances_` for the best model found by automl. See an [example](Examples/AutoML-for-XGBoost#plot-feature-importance).
Packages such as `azureml-interpret` and `sklearn.inspection.permutation_importance` can be used on `automl.model.estimator` to explain the selected model.
Model explanation is frequently asked and adding a native support may be a good feature. Suggestions/contributions are welcome.
Optimization history can be checked from the [log](Use-Cases/Task-Oriented-AutoML#log-the-trials). You can also [retrieve the log and plot the learning curve](Use-Cases/Task-Oriented-AutoML#plot-learning-curve).
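For example, `sklearn.inspection.permutation_importance` can be applied directly to `automl.model.estimator`. A minimal sketch, assuming a held-out `X_test`/`y_test` stored in pandas objects:
```python
from sklearn.inspection import permutation_importance

# automl.model.estimator is the underlying scikit-learn compatible model found by AutoML
result = permutation_importance(automl.model.estimator, X_test, y_test, n_repeats=5, random_state=0)
for name, importance in zip(X_test.columns, result.importances_mean):
    print(name, importance)
```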

View File

@@ -0,0 +1,87 @@
# Getting Started
<!-- ### Welcome to FLAML, a Fast Library for Automated Machine Learning & Tuning! -->
FLAML is a lightweight Python library that finds accurate machine
learning models automatically, efficiently and economically. It frees users from selecting learners and hyperparameters for each learner.
### Main Features
1. For common machine learning tasks like classification and regression, it quickly finds quality models for user-provided data with low computational resources. It supports both classical machine learning models and deep neural networks.
2. It is easy to customize or extend. Users can choose their desired customizability: minimal customization (computational resource budget), medium customization (e.g., scikit-style learner, search space and metric), or full customization (arbitrary training and evaluation code).
3. It supports fast and economical automatic tuning, capable of handling large search space with heterogeneous evaluation cost and complex constraints/guidance/early stopping. FLAML is powered by a new, [cost-effective
hyperparameter optimization](Use-Cases/Tune-User-Defined-Function#hyperparameter-optimization-algorithm)
and learner selection method invented by Microsoft Research.
### Quickstart
Install FLAML from pip: `pip install flaml`. Find more options in [Installation](Installation).
There are two ways of using flaml:
#### [Task-oriented AutoML](Use-Cases/task-oriented-automl)
For example, with three lines of code, you can start using this economical and fast AutoML engine as a scikit-learn style estimator.
```python
from flaml import AutoML
automl = AutoML()
automl.fit(X_train, y_train, task="classification")
```
It automatically tunes the hyperparameters and selects the best model from default learners such as LightGBM, XGBoost, random forest etc. [Customizing](Use-Cases/task-oriented-automl#customize-automlfit) the optimization metrics, learners and search spaces etc. is very easy. For example,
```python
automl.add_learner("mylgbm", MyLGBMEstimator)
automl.fit(X_train, y_train, task="classification", metric=custom_metric, estimator_list=["mylgbm"])
```
#### [Tune user-defined function](Use-Cases/Tune-User-Defined-Function)
You can run generic hyperparameter tuning for a custom function (machine learning or beyond). For example,
```python
from flaml import tune
from flaml.model import LGBMEstimator
import lightgbm
from sklearn.metrics import mean_squared_error
def train_lgbm(config: dict) -> dict:
# convert config dict to lgbm params
params = LGBMEstimator(**config).params
num_boost_round = params.pop("n_estimators")
# train the model
train_set = lightgbm.Dataset(X_train, y_train)
model = lightgbm.train(params, train_set, num_boost_round)
# evaluate the model
pred = model.predict(X_test)
mse = mean_squared_error(y_test, pred)
# return eval results as a dictionary
return {"mse": mse}
# load a built-in search space from flaml
flaml_lgbm_search_space = LGBMEstimator.search_space(X_train.shape)
# specify the search space as a dict from hp name to domain; you can define your own search space same way
config_search_space = {hp: space["domain"] for hp, space in flaml_lgbm_search_space.items()}
# give guidance about hp values corresponding to low training cost, i.e., {"n_estimators": 4, "num_leaves": 4}
low_cost_partial_config = {
hp: space["low_cost_init_value"]
for hp, space in flaml_lgbm_search_space.items()
if "low_cost_init_value" in space
}
# run the tuning, minimizing mse, with total time budget 3 seconds
analysis = tune.run(
train_lgbm, metric="mse", mode="min", config=config_search_space,
low_cost_partial_config=low_cost_partial_config, time_budget_s=3, num_samples=-1,
)
```
### Where to Go Next?
* Understand the use cases for [Task-oriented AutoML](Use-Cases/task-oriented-automl) and [Tune user-defined function](Use-Cases/Tune-User-Defined-Function).
* Find code examples under "Examples": from [AutoML - Classification](Examples/AutoML-Classification) to [Tune - PyTorch](Examples/Tune-PyTorch).
* Watch [video tutorials](https://www.youtube.com/channel/UCfU0zfFXHXdAd5x-WvFBk5A).
* Learn about [research](Research) around FLAML.
* Refer to [SDK](reference/automl) and [FAQ](FAQ).
If you like our project, please give it a [star](https://github.com/microsoft/FLAML/stargazers) on GitHub. If you are interested in contributing, please read [Contributor's Guide](Contribute).

View File

@@ -0,0 +1,63 @@
# Installation
FLAML requires **Python version >= 3.6**. It can be installed from pip:
```bash
pip install flaml
```
or conda:
```bash
conda install flaml -c conda-forge
```
FLAML has a .NET implementation as well from [ML.NET Model Builder](https://dotnet.microsoft.com/apps/machinelearning-ai/ml-dotnet/model-builder) in [Visual Studio](https://visualstudio.microsoft.com/) 2022.
## Optional Dependencies
### Notebook
To run the [notebook examples](https://github.com/microsoft/FLAML/tree/main/notebook),
install flaml with the [notebook] option:
```bash
pip install flaml[notebook]
```
### Extra learners
* catboost
```bash
pip install flaml[catboost]
```
* vowpal wabbit
```bash
pip install flaml[vw]
```
* time series forecaster: prophet, statsmodels
```bash
pip install flaml[forecast]
```
### Distributed tuning
* ray
```bash
pip install flaml[ray]
```
* nni
```bash
pip install flaml[nni]
```
* blendsearch
```bash
pip install flaml[blendsearch]
```
### Test and Benchmark
* test
```bash
pip install flaml[test]
```
* benchmark
```bash
pip install flaml[benchmark]
```

website/docs/Research.md Normal file
View File

@@ -0,0 +1,20 @@
# Research in FLAML
For technical details, please check our research publications.
* [FLAML: A Fast and Lightweight AutoML Library](https://www.microsoft.com/en-us/research/publication/flaml-a-fast-and-lightweight-automl-library/). Chi Wang, Qingyun Wu, Markus Weimer, Erkang Zhu. MLSys 2021.
```bibtex
@inproceedings{wang2021flaml,
title={FLAML: A Fast and Lightweight AutoML Library},
author={Chi Wang and Qingyun Wu and Markus Weimer and Erkang Zhu},
year={2021},
booktitle={MLSys},
}
```
* [Frugal Optimization for Cost-related Hyperparameters](https://arxiv.org/abs/2005.01571). Qingyun Wu, Chi Wang, Silu Huang. AAAI 2021.
* [Economical Hyperparameter Optimization With Blended Search Strategy](https://www.microsoft.com/en-us/research/publication/economical-hyperparameter-optimization-with-blended-search-strategy/). Chi Wang, Qingyun Wu, Silu Huang, Amin Saied. ICLR 2021.
* [ChaCha for Online AutoML](https://www.microsoft.com/en-us/research/publication/chacha-for-online-automl/). Qingyun Wu, Chi Wang, John Langford, Paul Mineiro and Marco Rossi. ICML 2021.
Many researchers and engineers have contributed to the technology development. In alphabetical order: Vijay Aski, Sebastien Bubeck, Surajit Chaudhuri, Kevin Chen, Yi Wei Chen, Nadiia Chepurko, Ofer Dekel, Alex Deng, Anshuman Dutt, Nicolo Fusi, Jianfeng Gao, Johannes Gehrke, Niklas Gustafsson, Silu Huang, Moe Kayali, Dongwoo Kim, Christian Konig, John Langford, Menghao Li, Mingqin Li, Xueqing Liu, Zhe Liu, Naveen Gaur, Paul Mineiro, Vivek Narasayya, Jake Radzikowski, Marco Rossi, Amin Saied, Neil Tenenholtz, Olga Vrousgou, Chi Wang, Yue Wang, Markus Weimer, Qingyun Wu, Qiufeng Yin, Haozhe Zhang, Minjia Zhang, XiaoYun Zhang, Eric Zhu.

View File

@@ -0,0 +1,478 @@
# Task Oriented AutoML
## Overview
`flaml.AutoML` is a class for task-oriented AutoML. It can be used as a scikit-learn style estimator with the standard `fit` and `predict` functions. The minimal inputs from users are the training data and the task type.
* Training data:
- numpy array. When the input data are stored in numpy array, they are passed to `fit()` as `X_train` and `y_train`.
- pandas dataframe. When the input data are stored in pandas dataframe, they are passed to `fit()` either as `X_train` and `y_train`, or as `dataframe` and `label`.
* Tasks (specified via `task`):
- 'classification': classification.
- 'regression': regression.
- 'ts_forecast': time series forecasting.
- 'rank': learning to rank.
- 'seq-classification': sequence classification.
- 'seq-regression': sequence regression.
An optional input is `time_budget` for searching models and hyperparameters. When not specified, a default budget of 60 seconds will be used.
A typical way to use `flaml.AutoML`:
```python
# Prepare training data
# ...
import pickle
from flaml import AutoML
automl = AutoML()
automl.fit(X_train, y_train, task="regression", time_budget=60, **other_settings)
# Save the model
with open("automl.pkl", "wb") as f:
pickle.dump(automl, f, pickle.HIGHEST_PROTOCOL)
# At prediction time
with open("automl.pkl", "rb") as f:
automl = pickle.load(f)
pred = automl.predict(X_test)
```
If users provide the minimal inputs only, `AutoML` uses the default settings for time budget, optimization metric, estimator list etc.
## Customize AutoML.fit()
### Optimization metric
The optimization metric is specified via the `metric` argument. It can be either a string which refers to a built-in metric, or a user-defined function.
* Built-in metric.
- 'accuracy': 1 - accuracy as the corresponding metric to minimize.
- 'log_loss': default metric for multiclass classification.
- 'r2': 1 - r2_score as the corresponding metric to minimize. Default metric for regression.
- 'rmse': root mean squared error.
- 'mse': mean squared error.
- 'mae': mean absolute error.
- 'mape': mean absolute percentage error.
- 'roc_auc': minimize 1 - roc_auc_score. Default metric for binary classification.
- 'roc_auc_ovr': minimize 1 - roc_auc_score with `multi_class="ovr"`.
- 'roc_auc_ovo': minimize 1 - roc_auc_score with `multi_class="ovo"`.
- 'f1': minimize 1 - f1_score.
- 'micro_f1': minimize 1 - f1_score with `average="micro"`.
- 'macro_f1': minimize 1 - f1_score with `average="macro"`.
- 'ap': minimize 1 - average_precision_score.
- 'ndcg': minimize 1 - ndcg_score.
- 'ndcg@k': minimize 1 - ndcg_score@k. k is an integer.
* User-defined function.
A customized metric function that requires the following (input) signature, and returns the input config's value in terms of the metric you want to minimize, and a dictionary of auxiliary information at your choice:
```python
def custom_metric(
X_val, y_val, estimator, labels,
X_train, y_train, weight_val=None, weight_train=None,
config=None, groups_val=None, groups_train=None,
):
return metric_to_minimize, metrics_to_log
```
For example,
```python
def custom_metric(
X_val, y_val, estimator, labels,
X_train, y_train, weight_val=None, weight_train=None,
**args,
):
from sklearn.metrics import log_loss
import time
start = time.time()
y_pred = estimator.predict_proba(X_val)
pred_time = (time.time() - start) / len(X_val)
val_loss = log_loss(y_val, y_pred, labels=labels, sample_weight=weight_val)
y_pred = estimator.predict_proba(X_train)
train_loss = log_loss(y_train, y_pred, labels=labels, sample_weight=weight_train)
alpha = 0.5
return val_loss * (1 + alpha) - alpha * train_loss, {
"val_loss": val_loss,
"train_loss": train_loss,
"pred_time": pred_time,
}
```
It returns the validation loss penalized by the gap between validation and training loss as the metric to minimize, and three metrics to log: val_loss, train_loss and pred_time. The arguments `config`, `groups_val` and `groups_train` are not used in the function.
### Estimator and search space
The estimator list can contain one or more estimator names, each corresponding to a built-in estimator or a custom estimator. Each estimator has a search space for hyperparameter configurations. FLAML supports both classical machine learning models and deep neural networks.
#### Estimator
* Built-in estimator.
- 'lgbm': LGBMEstimator. Hyperparameters: n_estimators, num_leaves, min_child_samples, learning_rate, log_max_bin (logarithm of (max_bin + 1) with base 2), colsample_bytree, reg_alpha, reg_lambda.
- 'xgboost': XGBoostSkLearnEstimator. Hyperparameters: n_estimators, max_leaves, max_depth, min_child_weight, learning_rate, subsample, colsample_bylevel, colsample_bytree, reg_alpha, reg_lambda.
- 'rf': RandomForestEstimator. Hyperparameters: n_estimators, max_features, max_leaves, criterion (for classification only).
- 'extra_tree': ExtraTreesEstimator. Hyperparameters: n_estimators, max_features, max_leaves, criterion (for classification only).
- 'lrl1': LRL1Classifier (sklearn.LogisticRegression with L1 regularization). Hyperparameters: C.
- 'lrl2': LRL2Classifier (sklearn.LogisticRegression with L2 regularization). Hyperparameters: C.
- 'catboost': CatBoostEstimator. Hyperparameters: early_stopping_rounds, learning_rate, n_estimators.
- 'kneighbor': KNeighborsEstimator. Hyperparameters: n_neighbors.
- 'prophet': Prophet. Hyperparameters: changepoint_prior_scale, seasonality_prior_scale, holidays_prior_scale, seasonality_mode.
- 'arima': ARIMA. Hyperparameters: p, d, q.
- 'sarimax': SARIMAX. Hyperparameters: p, d, q, P, D, Q, s.
- 'transformer': Huggingface transformer models. Hyperparameters: learning_rate, num_train_epochs, per_device_train_batch_size, warmup_ratio, weight_decay, adam_epsilon, seed.
* Custom estimator. Use custom estimator for:
- tuning an estimator that is not built-in;
- customizing search space for a built-in estimator.
To tune a custom estimator that is not built-in, you need to:
1. Build a custom estimator by inheriting `flaml.model.BaseEstimator` or a derived class.
For example, if you have an estimator class with scikit-learn style `fit()` and `predict()` functions, you only need to set `self.estimator_class` to be that class in your constructor.
```python
from flaml.model import SKLearnEstimator
# SKLearnEstimator is derived from BaseEstimator
from flaml import tune
from flaml.data import CLASSIFICATION
import rgf
class MyRegularizedGreedyForest(SKLearnEstimator):
def __init__(self, task="binary", **config):
super().__init__(task, **config)
if task in CLASSIFICATION:
from rgf.sklearn import RGFClassifier
self.estimator_class = RGFClassifier
else:
from rgf.sklearn import RGFRegressor
self.estimator_class = RGFRegressor
@classmethod
def search_space(cls, data_size, task):
space = {
"max_leaf": {
"domain": tune.lograndint(lower=4, upper=data_size),
"low_cost_init_value": 4,
},
"n_iter": {
"domain": tune.lograndint(lower=1, upper=data_size),
"low_cost_init_value": 1,
},
"learning_rate": {"domain": tune.loguniform(lower=0.01, upper=20.0)},
"min_samples_leaf": {
"domain": tune.lograndint(lower=1, upper=20),
"init_value": 20,
},
}
return space
```
In the constructor, we set `self.estimator_class` as `RGFClassifier` or `RGFRegressor` according to the task type. If the estimator you want to tune does not have a scikit-learn style `fit()` and `predict()` API, you can override the `fit()` and `predict()` function of `flaml.model.BaseEstimator`, like [XGBoostEstimator](https://github.com/microsoft/FLAML/blob/59083fbdcb95c15819a0063a355969203022271c/flaml/model.py#L511).
2. Give the custom estimator a name and add it in AutoML. E.g.,
```python
from flaml import AutoML
automl = AutoML()
automl.add_learner("rgf", MyRegularizedGreedyForest)
```
This registers the `MyRegularizedGreedyForest` class in AutoML, with the name "rgf".
3. Tune the newly added custom estimator in either of the following two ways depending on your needs:
- tune rgf alone: `automl.fit(..., estimator_list=["rgf"])`; or
- mix it with other built-in learners: `automl.fit(..., estimator_list=["rgf", "lgbm", "xgboost", "rf"])`.
#### Search space
Each estimator class, built-in or not, must have a `search_space` function. In the `search_space` function, we return a dictionary about the hyperparameters, the keys of which are the names of the hyperparameters to tune, and each value is a set of detailed search configurations about the corresponding hyperparameters represented in a dictionary. A search configuration dictionary includes the following fields:
* `domain`, which specifies the possible values of the hyperparameter and their distribution. Please refer to [more details about the search space domain](Tune-User-Defined-Function#more-details-about-the-search-space-domain).
* `init_value` (optional), which specifies the initial value of the hyperparameter.
* `low_cost_init_value` (optional), which specifies the value of the hyperparameter that is associated with low computation cost. See [cost related hyperparameters](Tune-User-Defined-Function#cost-related-hyperparameters) or [FAQ](../FAQ#about-low_cost_partial_config-in-tune) for more details.
In the example above, we tune four hyperparameters, three integers and one float. They all follow a log-uniform distribution. "max_leaf" and "n_iter" have "low_cost_init_value" specified as their values heavily influence the training cost.
To customize the search space for a built-in estimator, use a similar approach to define a class that inherits the existing estimator. For example,
```python
from flaml.model import XGBoostEstimator
def logregobj(preds, dtrain):
labels = dtrain.get_label()
preds = 1.0 / (1.0 + np.exp(-preds)) # transform raw leaf weight
grad = preds - labels
hess = preds * (1.0 - preds)
return grad, hess
class MyXGB1(XGBoostEstimator):
"""XGBoostEstimator with logregobj as the objective function"""
def __init__(self, **config):
super().__init__(objective=logregobj, **config)
```
We override the constructor and set the training objective as a custom function `logregobj`. The hyperparameters and their search range do not change. For another example,
```python
class XGBoost2D(XGBoostSklearnEstimator):
@classmethod
def search_space(cls, data_size, task):
upper = min(32768, int(data_size))
return {
"n_estimators": {
"domain": tune.lograndint(lower=4, upper=upper),
"low_cost_init_value": 4,
},
"max_leaves": {
"domain": tune.lograndint(lower=4, upper=upper),
"low_cost_init_value": 4,
},
}
```
We override the `search_space` function to tune two hyperparameters only, "n_estimators" and "max_leaves". They are both random integers in the log space, ranging from 4 to a data-dependent upper bound. The lower bound for each corresponds to low training cost, hence the "low_cost_init_value" for each is set to 4.
### Constraint
There are several types of constraints you can impose.
1. End-to-end constraints on the AutoML process.
- `time_budget`: constrains the wall-clock time (seconds) used by the AutoML process. We provide some tips on [how to set time budget](#how-to-set-time-budget).
- `max_iter`: constrains the maximal number of models to try in the AutoML process.
2. Constraints on the hyperparameters of the estimators.
Some constraints on the estimator can be implemented via the custom learner. For example,
```python
class MonotonicXGBoostEstimator(XGBoostSklearnEstimator):
@classmethod
def search_space(**args):
return super().search_space(**args).update({"monotone_constraints": "(1, -1)"})
```
It adds a monotonicity constraint to XGBoost. This approach can be used to set any constraint that is a parameter in the underlying estimator's constructor.
3. Constraints on the models tried in AutoML.
Users can set constraints such as the maximal number of models to try, limit on training time and prediction time per model.
* `train_time_limit`: training time in seconds.
* `pred_time_limit`: prediction time per instance in seconds.
For example,
```python
automl.fit(X_train, y_train, max_iter=100, train_time_limit=1, pred_time_limit=1e-3)
```
### Ensemble
To use stacked ensemble after the model search, set `ensemble=True` or a dict. When `ensemble=True`, the final estimator and `passthrough` in the stacker will be automatically chosen. You can specify customized final estimator or passthrough option:
* "final_estimator": an instance of the final estimator in the stacker.
* "passthrough": True (default) or False, whether to pass the original features to the stacker.
For example,
```python
from sklearn.linear_model import LogisticRegression

automl.fit(
    X_train, y_train, task="classification",
    ensemble={
"final_estimator": LogisticRegression(),
"passthrough": False,
},
)
```
### Resampling strategy
By default, flaml decides the resampling automatically according to the data size and the time budget. If you would like to enforce a certain resampling strategy, you can set `eval_method` to be "holdout" or "cv" for holdout or cross-validation.
For holdout, you can also set:
* `split_ratio`: the fraction for validation data, 0.1 by default.
* `X_val`, `y_val`: a separate validation dataset. When they are passed, the validation metrics will be computed against this given validation dataset. If they are not passed, then a validation dataset will be split from the training data and held out from training during the model search. After the model search, flaml will retrain the model with best configuration on the full training data.
You can set `retrain_full` to be `False` to skip the final retraining, or "budget" to ask flaml to do its best to retrain within the time budget.
For cross validation, you can also set `n_splits`, the number of folds. By default it is 5.
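For example (a sketch; `X_val` and `y_val` are assumed to be prepared by the user):
```python
# enforce 3-fold cross-validation instead of the automatic choice
automl.fit(X_train, y_train, task="classification", eval_method="cv", n_splits=3)

# holdout against a user-provided validation set, skipping the final retraining
automl.fit(
    X_train, y_train, task="classification", eval_method="holdout",
    X_val=X_val, y_val=y_val, retrain_full=False,
)
```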
#### Data split method
By default, flaml uses the following method to split the data:
* stratified split for classification;
* uniform split for regression;
* time-based split for time series forecasting;
* group-based split for learning to rank.
The data split method for classification can be changed into uniform split by setting `split_type="uniform"`. For both classification and regression, time-based split can be enforced if the data are sorted by timestamps, by setting `split_type="time"`.
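For example:
```python
# uniform split for a classification task
automl.fit(X_train, y_train, task="classification", split_type="uniform")

# time-based split (the data must be sorted by timestamps)
automl.fit(X_train, y_train, task="regression", split_type="time")
```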
### Parallel tuning
When you have parallel resources, you can either spend them in training and keep the model search sequential, or perform parallel search. Following scikit-learn, the parameter `n_jobs` specifies how many CPU cores to use for each training job. The number of parallel trials is specified via the parameter `n_concurrent_trials`. By default, `n_jobs=-1, n_concurrent_trials=1`. That is, all the CPU cores (in a single compute node) are used for training a single model and the search is sequential. When you have more resources than what each single training job needs, you can consider increasing `n_concurrent_trials`.
To do parallel tuning, install the `ray` and `blendsearch` options:
```bash
pip install flaml[ray,blendsearch]
```
`ray` is used to manage the resources. For example,
```python
ray.init(num_cpus=16)
```
allocates 16 CPU cores. Then, when you run:
```python
automl.fit(X_train, y_train, n_jobs=4, n_concurrent_trials=4)
```
flaml will perform 4 trials in parallel, each consuming 4 CPU cores. The parallel tuning uses the [BlendSearch](Tune-User-Defined-Function#blendsearch-economical-hyperparameter-optimization-with-blended-search-strategy) algorithm.
### Warm start
We can warm start the AutoML by providing starting points of hyperparameter configurations for each estimator. For example, if you have run AutoML for one hour, after checking the results, you would like to run it for another two hours, then you can use the best configurations found for each estimator as the starting points for the new run.
```python
automl1 = AutoML()
automl1.fit(X_train, y_train, time_budget=3600)
automl2 = AutoML()
automl2.fit(X_train, y_train, time_budget=7200, starting_points=automl1.best_config_per_estimator)
```
`starting_points` is a dictionary. The keys are the estimator names. If you do not need to specify starting points for an estimator, exclude its name from the dictionary. The value for each key can be either a dictionary or a list of dictionaries, corresponding to one hyperparameter configuration or multiple hyperparameter configurations, respectively.
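The following sketch makes the structure explicit by reusing the configurations found by `automl1` above (assuming both lgbm and xgboost were tried in the previous run):
```python
starting_points = {
    "lgbm": automl1.best_config_per_estimator["lgbm"],          # a single configuration (a dict)
    "xgboost": [automl1.best_config_per_estimator["xgboost"]],  # or a list of configurations
    # estimators not listed here start from their default initial points
}
automl2.fit(X_train, y_train, time_budget=7200, starting_points=starting_points)
```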
### Log the trials
The trials are logged in a file if a `log_file_name` is passed.
Each trial is logged as a json record in one line. The best trial's id is logged in the last line. For example,
```
{"record_id": 0, "iter_per_learner": 1, "logged_metric": null, "trial_time": 0.12717914581298828, "wall_clock_time": 0.1728971004486084, "validation_loss": 0.07333333333333332, "config": {"n_estimators": 4, "num_leaves": 4, "min_child_samples": 20, "learning_rate": 0.09999999999999995, "log_max_bin": 8, "colsample_bytree": 1.0, "reg_alpha": 0.0009765625, "reg_lambda": 1.0}, "learner": "lgbm", "sample_size": 150}
{"record_id": 1, "iter_per_learner": 3, "logged_metric": null, "trial_time": 0.07027268409729004, "wall_clock_time": 0.3756711483001709, "validation_loss": 0.05333333333333332, "config": {"n_estimators": 4, "num_leaves": 4, "min_child_samples": 12, "learning_rate": 0.2677050123105203, "log_max_bin": 7, "colsample_bytree": 1.0, "reg_alpha": 0.001348364934537134, "reg_lambda": 1.4442580148221913}, "learner": "lgbm", "sample_size": 150}
{"curr_best_record_id": 1}
```
1. `iter_per_learner` means how many models have been tried for each learner. The reason you see records like `iter_per_learner=3` for `record_id=1` is that flaml only logs better configs than the previous iters by default, i.e., `log_type='better'`. If you use `log_type='all'` instead, all the trials will be logged.
1. `trial_time` means the time taken to train and evaluate one config in that trial. `wall_clock_time` means the total wall-clock time spent since the beginning of `fit()`.
1. flaml will adjust the `n_estimators` for lightgbm etc. according to the remaining budget, and check the time budget constraint and stop in several places. Most of the time, that makes `fit()` stop before the given budget. Occasionally it may run over the time budget slightly. But the log file always contains the best config info, and you can recover the best model up to any time point using `retrain_from_log()`, as sketched below.
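A minimal sketch of recovering the best model found within the first 60 seconds of a previous run from its log file (the exact set of keyword arguments may vary slightly across versions):
```python
automl = AutoML()
automl.retrain_from_log(
    log_file_name="automl.log",  # the log file written by a previous fit()
    X_train=X_train,
    y_train=y_train,
    task="regression",
    time_budget=60,  # only consider trials finished within this budget
)
print(automl.model.estimator)
```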
We can also use mlflow for logging:
```python
mlflow.set_experiment("flaml")
with mlflow.start_run():
automl.fit(X_train=X_train, y_train=y_train, **settings)
```
### Extra fit arguments
Extra fit arguments that are needed by the estimators can be passed to `AutoML.fit()`. For example, if there is a weight associated with each training example, it can be passed via `sample_weight`. For another example, `period` can be passed for a time series forecaster. For any extra keyword argument passed to `AutoML.fit()` which has not been explicitly listed in the function signature, it will be passed to the underlying estimators' `fit()` as is.
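For example (a sketch; `weights` and the forecasting dataframe `df` are assumed to be prepared by the user):
```python
# per-example weights are forwarded to the underlying estimators' fit()
automl.fit(X_train, y_train, task="classification", time_budget=60, sample_weight=weights)

# the forecast horizon `period` is forwarded for a time series forecasting task
automl.fit(dataframe=df, label="y", task="ts_forecast", time_budget=60, period=12)
```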
## Retrieve and analyze the outcomes of AutoML.fit()
### Get best model
The best model can be obtained by the `model` property of an `AutoML` instance. For example,
```python
automl.fit(X_train, y_train, task="regression")
print(automl.model)
# <flaml.model.LGBMEstimator object at 0x7f9b502c4550>
```
`flaml.model.LGBMEstimator` is a wrapper class for LightGBM models. To access the underlying model, use the `estimator` property of the `flaml.model.LGBMEstimator` instance.
```python
print(automl.model.estimator)
'''
LGBMRegressor(colsample_bytree=0.7610534336273627,
learning_rate=0.41929025492645006, max_bin=255,
min_child_samples=4, n_estimators=45, num_leaves=4,
reg_alpha=0.0009765625, reg_lambda=0.009280655005879943,
verbose=-1)
'''
```
Just like a normal LightGBM model, we can inspect it. For example, we can plot the feature importance:
```python
import matplotlib.pyplot as plt
plt.barh(automl.model.estimator.feature_name_, automl.model.estimator.feature_importances_)
```
![png](images/feature_importance.png)
### Get best configuration
We can find the best estimator's name and best configuration by:
```python
print(automl.best_estimator)
# lgbm
print(automl.best_config)
# {'n_estimators': 148, 'num_leaves': 18, 'min_child_samples': 3, 'learning_rate': 0.17402065726724145, 'log_max_bin': 8, 'colsample_bytree': 0.6649148062238498, 'reg_alpha': 0.0009765625, 'reg_lambda': 0.0067613624509965}
```
We can also find the best configuration per estimator.
```python
print(automl.best_config_per_estimator)
# {'lgbm': {'n_estimators': 148, 'num_leaves': 18, 'min_child_samples': 3, 'learning_rate': 0.17402065726724145, 'log_max_bin': 8, 'colsample_bytree': 0.6649148062238498, 'reg_alpha': 0.0009765625, 'reg_lambda': 0.0067613624509965}, 'rf': None, 'catboost': None, 'xgboost': {'n_estimators': 4, 'max_leaves': 4, 'min_child_weight': 1.8630223791106992, 'learning_rate': 1.0, 'subsample': 0.8513627344387318, 'colsample_bylevel': 1.0, 'colsample_bytree': 0.946138073111236, 'reg_alpha': 0.0018311776973217073, 'reg_lambda': 0.27901659190538414}, 'extra_tree': {'n_estimators': 4, 'max_features': 1.0, 'max_leaves': 4}}
```
The `None` value corresponds to the estimators which have not been tried.
Other useful information:
```python
print(automl.best_config_train_time)
# 0.24841618537902832
print(automl.best_iteration)
# 10
print(automl.best_loss)
# 0.15448622217577546
print(automl.time_to_find_best_model)
# 0.4167296886444092
print(automl.config_history)
# {0: ('lgbm', {'n_estimators': 4, 'num_leaves': 4, 'min_child_samples': 20, 'learning_rate': 0.09999999999999995, 'log_max_bin': 8, 'colsample_bytree': 1.0, 'reg_alpha': 0.0009765625, 'reg_lambda': 1.0}, 1.2300517559051514)}
# Meaning: at iteration 0, the config tried is {'n_estimators': 4, 'num_leaves': 4, 'min_child_samples': 20, 'learning_rate': 0.09999999999999995, 'log_max_bin': 8, 'colsample_bytree': 1.0, 'reg_alpha': 0.0009765625, 'reg_lambda': 1.0} for lgbm, and the wallclock time is 1.23s when this trial is finished.
```
### Plot learning curve
To plot how the loss is improved over time during the model search, first load the search history from the log file:
```python
from flaml.data import get_output_from_log
time_history, best_valid_loss_history, valid_loss_history, config_history, metric_history = \
get_output_from_log(filename=settings["log_file_name"], time_budget=120)
```
Then, assuming the optimization metric is "accuracy", we can plot the accuracy versus wallclock time:
```python
import matplotlib.pyplot as plt
import numpy as np
plt.title("Learning Curve")
plt.xlabel("Wall Clock Time (s)")
plt.ylabel("Validation Accuracy")
plt.step(time_history, 1 - np.array(best_valid_loss_history), where="post")
plt.show()
```
![png](images/curve.png)
The curve suggests that increasing the time budget may further improve the accuracy.
### How to set time budget
* If you have an exact constraint for the total search time, set it as the time budget.
* If you have flexible time constraints, for example, your desired time budget is t1=60s, and the longest time budget you can tolerate is t2=3600s, you can try the following two ways:
1. set t1 as the time budget, and check the message in the console log in the end. If the budget is too small, you will see a warning like
> WARNING - Time taken to find the best model is 91% of the provided time budget and not all estimators' hyperparameter search converged. Consider increasing the time budget.
2. set t2 as the time budget, and also set `early_stop=True`. If the early stopping is triggered, you will see a warning like
> WARNING - All estimator hyperparameters local search has converged at least once, and the total search time exceeds 10 times the time taken to find the best model.
> WARNING - Stopping search as early_stop is set to True.
### How much time is needed to find the best model
If you want to get a sense of how much time is needed to find the best model, you can use `max_iter=2` to perform two trials first. The message will be like:
> INFO - iteration 0, current learner lgbm
> INFO - Estimated sufficient time budget=145194s. Estimated necessary time budget=2118s.
> INFO - at 2.6s, estimator lgbm's best error=0.4459, best estimator lgbm's best error=0.4459
You will see that the time to finish the first and cheapest trial is 2.6 seconds. The estimated necessary time budget is 2118 seconds, and the estimated sufficient time budget is 145194 seconds. Note that this is only an estimated range to help you decide your budget.
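For example (a sketch):
```python
automl = AutoML()
automl.fit(X_train, y_train, task="classification", max_iter=2)
# check the console log for the estimated necessary/sufficient time budget
```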

View File

@@ -0,0 +1,550 @@
# Tune User Defined Function
`flaml.tune` is a module for economical hyperparameter tuning. It is used internally by `flaml.AutoML`. It can also be used to directly tune a user-defined function (UDF), which is not limited to machine learning model training. You can use `flaml.tune` instead of `flaml.AutoML` if one of the following is true:
1. Your machine learning task is not one of the built-in tasks from `flaml.AutoML`.
1. Your input cannot be represented as X_train + y_train or dataframe + label.
1. You want to tune a function that may not even be a machine learning procedure.
## Basic Tuning Procedure
There are three essential steps (assuming the knowledge of the set of hyperparameters to tune) to use `flaml.tune` to finish a basic tuning task:
1. Specify the [tuning objective](#tuning-objective) with respect to the hyperparameters.
1. Specify a [search space](#search-space) of the hyperparameters.
1. Specify [tuning constraints](#tuning-constraints), including constraints on the resource budget to do the tuning, constraints on the configurations, or/and constraints on a (or multiple) particular metric(s).
With these steps, you can [perform a basic tuning task](#put-together) accordingly.
### Tuning objective
Related arguments:
- `evaluation_function`: A user-defined evaluation function.
- `metric`: A string of the metric name to optimize for.
- `mode`: A string in ['min', 'max'] to specify the objective as minimization or maximization.
The first step is to specify your tuning objective.
To do it, you should first specify your evaluation procedure (e.g., perform a machine learning model training and validation) with respect to the hyperparameters in a user-defined function `evaluation_function`.
The function takes a hyperparameter configuration as input, and can either return a single scalar metric value or a dictionary of metric name and metric value pairs.
In the following code, we define an evaluation function with respect to two hyperparameters named `x` and `y` according to $obj := (x-85000)^2 - x/y$. Note that we use this toy example here for ease of demonstration. In real use cases, the evaluation function usually cannot be written in this closed form, but instead involves a black-box and expensive evaluation procedure. Please check out [Tune HuggingFace](../Examples/Tune-HuggingFace), [Tune PyTorch](../Examples/Tune-PyTorch) and [Tune LightGBM](../Getting-Started#tune-user-defined-function) for real examples of tuning tasks.
```python
import time
def evaluate_config(config: dict):
    """Evaluate a hyperparameter configuration."""
    score = (config["x"] - 85000) ** 2 - config["x"] / config["y"]
    # usually the evaluation takes a non-negligible cost,
    # and the cost could be related to certain hyperparameters;
    # here we simulate this cost by calling the time.sleep() function
    # and assume the cost is proportional to x
    faked_evaluation_cost = config["x"] / 100000
    time.sleep(faked_evaluation_cost)
    # we can return a single float as the score on the input config:
    # return score
    # or, we can return a dictionary that maps metric names to metric values:
    return {"score": score, "evaluation_cost": faked_evaluation_cost, "constraint_metric": config["x"] * config["y"]}
```
When the evaluation function returns a dictionary of metrics, you need to specify the name of the metric to optimize via the argument `metric` (this can be skipped when the function is just returning a scalar). In addition, you need to specify a mode of your optimization/tuning task (maximization or minimization) via the argument `mode` by choosing from "min" or "max".
For example,
```python
flaml.tune.run(evaluation_function=evaluate_config, metric="score", mode="min", ...)
```
### Search space
Related arguments:
- `config`: A dictionary to specify the search space.
- `low_cost_partial_config` (optional): A dictionary from a subset of controlled dimensions to the initial low-cost values.
- `cat_hp_cost` (optional): A dictionary from a subset of categorical dimensions to the relative cost of each choice.
The second step is to specify a search space of the hyperparameters through the argument `config`. In the search space, you need to specify valid values for your hyperparameters and can specify how these values are sampled (e.g., from a uniform distribution or a log-uniform distribution).
In the following code example, we include a search space for the two hyperparameters `x` and `y` as introduced above. The valid values for both are integers in the range [1, 100000]. The values for `x` are sampled uniformly in logarithmic space within that range (using `tune.lograndint(lower=1, upper=100000)`), and the values for `y` are sampled uniformly within that range (using `tune.randint(lower=1, upper=100000)`).
```python
from flaml import tune
# construct a search space for the hyperparameters x and y.
config_search_space = {
"x": tune.lograndint(lower=1, upper=100000),
"y": tune.randint(lower=1, upper=100000)
}
# provide the search space to flaml.tune
flaml.tune.run(..., config=config_search_space, ...)
```
#### More details about the search space domain
The corresponding value of a particular hyperparameter in the search space dictionary is called a domain, for example, `tune.randint(lower=1, upper=100000)` is the domain for the hyperparameter `y`. The domain specifies a type and valid range to sample parameters from. Supported types include float, integer, and categorical. You can also specify how to sample values from certain distributions in linear scale or log scale.
It is a common practice to sample in log scale if the valid value range is large and the evaluation function changes more regularly with respect to the log domain.
See the example below for the commonly used types of domains.
```python
config = {
    # Sample a float uniformly between -5.0 and -1.0
    "uniform": tune.uniform(-5, -1),
    # Sample a float uniformly between 3.2 and 5.4,
    # rounding to increments of 0.2
    "quniform": tune.quniform(3.2, 5.4, 0.2),
    # Sample a float uniformly between 0.0001 and 0.01, while
    # sampling in log space
    "loguniform": tune.loguniform(1e-4, 1e-2),
    # Sample a float uniformly between 0.0001 and 0.1, while
    # sampling in log space and rounding to increments of 0.00005
    "qloguniform": tune.qloguniform(1e-4, 1e-1, 5e-5),
    # Sample a random float from a normal distribution with
    # mean=10 and sd=2
    "randn": tune.randn(10, 2),
    # Sample a random float from a normal distribution with
    # mean=10 and sd=2, rounding to increments of 0.2
    "qrandn": tune.qrandn(10, 2, 0.2),
    # Sample an integer uniformly between -9 (inclusive) and 15 (exclusive)
    "randint": tune.randint(-9, 15),
    # Sample an integer uniformly between -21 (inclusive) and 12 (inclusive (!)),
    # rounding to increments of 3 (includes 12)
    "qrandint": tune.qrandint(-21, 12, 3),
    # Sample an integer uniformly between 1 (inclusive) and 10 (exclusive),
    # while sampling in log space
    "lograndint": tune.lograndint(1, 10),
    # Sample an integer uniformly between 1 (inclusive) and 10 (inclusive (!)),
    # while sampling in log space and rounding to increments of 2
    "qlograndint": tune.qlograndint(1, 10, 2),
    # Sample an option uniformly from the specified choices
    "choice": tune.choice(["a", "b", "c"]),
}
```
<!-- Please refer to [ray.tune](https://docs.ray.io/en/latest/tune/api_docs/search_space.html#overview) for a more comprehensive introduction about possible choices of the domain. -->
#### Cost-related hyperparameters
Cost-related hyperparameters are a subset of the hyperparameters which directly affect the computation cost incurred in the evaluation of any hyperparameter configuration. For example, the number of estimators (`n_estimators`) and the maximum number of leaves (`max_leaves`) are known to affect the training cost of tree-based learners. So they are cost-related hyperparameters for tree-based learners.
When cost-related hyperparameters exist, the evaluation cost in the search space is heterogeneous.
In this case, designing a search space with proper ranges of the hyperparameter values is highly non-trivial. Classical tuning algorithms such as Bayesian optimization and random search are typically sensitive to such ranges: if the ranges are too large, they may incur a very high cost before finding a good choice; if the ranges are too small, the optimal choice(s) may be excluded and thus impossible to find. With our method, you can use a search space with larger ranges in the case of heterogeneous cost.
Our search algorithms are designed to finish the tuning process at a low total cost when the evaluation cost in the search space is heterogeneous.
So in such scenarios, if you are aware of low-cost configurations for the cost-related hyperparameters, you are encouraged to set them as the `low_cost_partial_config`, which is a dictionary of a subset of the hyperparameter coordinates whose value corresponds to a configuration with known low cost. Using the example of the tree-based methods again, since we know that small `n_estimators` and `max_leaves` generally correspond to simpler models and thus lower cost, we set `{'n_estimators': 4, 'max_leaves': 4}` as the `low_cost_partial_config` by default (note that 4 is the lower bound of search space for these two hyperparameters), e.g., in LGBM. Please find more details on how the algorithm works [here](#cfo-frugal-optimization-for-cost-related-hyperparameters).
In addition, if you are aware of the cost relationship between different categorical hyperparameter choices, you are encouraged to provide this information through `cat_hp_cost`. It also helps the search algorithm to reduce the total cost.
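As a concrete illustration, here is a minimal sketch of passing these hints to `flaml.tune.run`. The hyperparameter names (`n_estimators`, `max_leaves`, `booster`), the toy objective, and the relative costs are hypothetical and only serve to show the argument format:
```python
import time
from flaml import tune

def toy_objective(config):
    # simulate an evaluation whose cost grows with n_estimators
    time.sleep(config["n_estimators"] / 10000)
    return {"score": config["n_estimators"] * config["max_leaves"]}

analysis = tune.run(
    toy_objective,
    config={
        "n_estimators": tune.lograndint(lower=4, upper=32768),
        "max_leaves": tune.lograndint(lower=4, upper=32768),
        "booster": tune.choice(["gbtree", "dart"]),
    },
    metric="score",
    mode="min",
    # known low-cost values for the cost-related hyperparameters
    low_cost_partial_config={"n_estimators": 4, "max_leaves": 4},
    # assumed relative evaluation cost of the categorical choices,
    # aligned with the order of the choices in the search space
    cat_hp_cost={"booster": [1, 2]},
    time_budget_s=10,
    num_samples=-1,
)
```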
### Tuning constraints
Related arguments:
- `time_budget_s`: The time budget in seconds.
- `num_samples`: An integer of the number of configs to try.
- `config_constraints` (optional): A list of config constraints to be satisfied.
- `metric_constraints` (optional): A list of metric constraints to be satisfied, e.g., `[("precision", ">=", 0.9)]`.
The third step is to specify constraints of the tuning task. One notable property of `flaml.tune` is that it is able to finish the tuning process (obtaining good results) within a required resource constraint. A user can either provide the resource constraint in terms of wall-clock time (in seconds) through the argument `time_budget_s`, or in terms of the number of trials through the argument `num_samples`. The following example shows three use cases:
```python
# Set a resource constraint of 60 seconds wall-clock time for the tuning.
flaml.tune.run(..., time_budget_s=60, ...)
# Set a resource constraint of 100 trials for the tuning.
flaml.tune.run(..., num_samples=100, ...)
# Use at most 60 seconds and at most 100 trials for the tuning.
flaml.tune.run(..., time_budget_s=60, num_samples=100, ...)
```
Optionally, you can provide a list of config constraints to be satisfied through the argument `config_constraints` and provide a list of metric constraints to be satisfied through the argument `metric_constraints`. We provide more details about related use cases in the [Advanced Tuning Options](#more-constraints-on-the-tuning) section.
### Put together
After the aforementioned key steps, one is ready to perform a tuning task by calling `flaml.tune.run()`. Below is a quick sequential tuning example using the pre-defined search space `config_search_space` and a minimization (`mode='min'`) objective for the `score` metric evaluated in `evaluate_config`, using the default search algorithm in flaml. The time budget is 10 seconds (`time_budget_s=10`).
```python
# require: pip install flaml[blendsearch]
analysis = tune.run(
    evaluate_config,  # the function to evaluate a config
    config=config_search_space,  # the search space defined
    metric="score",
    mode="min",  # the optimization mode, "min" or "max"
    num_samples=-1,  # the maximal number of configs to try, -1 means infinite
    time_budget_s=10,  # the time budget in seconds
)
```
### Result analysis
Once the tuning process finishes, it returns an [ExperimentAnalysis](../reference/tune/analysis) object, which provides methods to analyze the tuning.
In the following code example, we retrieve the best configuration found during the tuning, and retrieve the best trial's result from the returned `analysis`.
```python
analysis = tune.run(
    evaluate_config,  # the function to evaluate a config
    config=config_search_space,  # the search space defined
    metric="score",
    mode="min",  # the optimization mode, "min" or "max"
    num_samples=-1,  # the maximal number of configs to try, -1 means infinite
    time_budget_s=10,  # the time budget in seconds
)
print(analysis.best_config) # the best config
print(analysis.best_trial.last_result) # the best trial's result
```
## Advanced Tuning Options
There are several advanced tuning options worth mentioning.
### More constraints on the tuning
A user can specify constraints on the configurations to be satisfied via the argument `config_constraints`. The argument `config_constraints` receives a list of such constraints. Specifically, each constraint is a tuple that consists of (1) a function that takes a configuration as input and returns a numerical value; (2) an operation chosen from "<=" or ">="; (3) a numerical threshold.
In the following code example, we constrain the output of `area`, which takes a configuration as input and outputs a numerical value, to be no larger than 1000.
```python
def area(config):
    return config["width"] * config["height"]

flaml.tune.run(evaluation_function=evaluate_config, mode="min",
               config=config_search_space,
               config_constraints=[(area, "<=", 1000)], ...)
```
You can also specify a list of metric constraints to be satisfied via the argument `metric_constraints`. Each element in the `metric_constraints` list is a tuple that consists of (1) a string specifying the name of the metric (the metric name must be defined and returned in the user-defined `evaluation_function`); (2) an operation chosen from "<=" or ">="; (3) a numerical threshold.
In the following code example, we constrain the metric `score` to be no larger than 0.4.
```python
flaml.tune.run(evaluation_function=evaluate_config, mode="min",
               config=config_search_space,
               metric_constraints=[("score", "<=", 0.4)], ...)
```
### Parallel tuning
Related arguments:
- `use_ray`: A boolean of whether to use ray as the backend.
- `resources_per_trial`: A dictionary of the hardware resources to allocate per trial, e.g., `{'cpu': 1}`. Only valid when using ray backend.
You can perform parallel tuning by specifying `use_ray=True` (this requires the `flaml[ray]` option to be installed). You can also limit the amount of resources allocated per trial by specifying `resources_per_trial`, e.g., `resources_per_trial={'cpu': 2}`.
```python
# require: pip install flaml[ray]
analysis = tune.run(
    evaluate_config,  # the function to evaluate a config
    config=config_search_space,  # the search space defined
    metric="score",
    mode="min",  # the optimization mode, "min" or "max"
    num_samples=-1,  # the maximal number of configs to try, -1 means infinite
    time_budget_s=10,  # the time budget in seconds
    use_ray=True,
    resources_per_trial={"cpu": 2},  # limit resources allocated per trial
)
print(analysis.best_trial.last_result) # the best trial's result
print(analysis.best_config) # the best config
```
**A heads-up about computation overhead.** When parallel tuning is used, there is a certain amount of computation overhead in each trial. If each trial's original cost is much smaller than the overhead, parallel tuning can underperform sequential tuning. Sequential tuning is recommended when compute resources are limited and each trial can consume all of them.
### Trial scheduling
Related arguments:
- `scheduler`: A scheduler for executing the trials.
- `resource_attr`: A string to specify the resource dimension used by the scheduler.
- `min_resource`: A float of the minimal resource to use for the resource_attr.
- `max_resource`: A float of the maximal resource to use for the resource_attr.
- `reduction_factor`: A float of the reduction factor used for incremental pruning.
A scheduler can help manage the trials' execution. It can be used to perform multi-fidelity evaluation and/or early stopping. You can use two different types of schedulers in `flaml.tune` via `scheduler`.
#### 1. A scheduler implemented natively in FLAML (`scheduler='flaml'`).
This scheduler is native to the new search algorithms provided by FLAML. In a nutshell, it starts the search with the minimum resource and switches between HPO with the current resource and increasing the resource for evaluation, depending on which leads to faster improvement.
If this scheduler is used, you need to
- Specify a resource dimension. Conceptually, a 'resource dimension' is a factor that affects the cost of the evaluation (e.g., sample size, the number of epochs). You need to specify the name of the resource dimension via `resource_attr`. For example, if `resource_attr="sample_size"`, then the config dict passed to the `evaluation_function` would contain a key "sample_size" and its value suggested by the search algorithm. That value should be used in the evaluation function to control the compute cost. The larger the value, the more expensive the evaluation.
- Provide the lower and upper limit of the resource dimension via `min_resource` and `max_resource`, and optionally provide `reduction_factor`, which determines the magnitude of resource (multiplicative) increase when we decide to increase the resource.
In the following code example, we consider the sample size as the resource dimension. It determines how much data is used to perform training as reflected in the `evaluation_function`. We set the `min_resource` and `max_resource` to 1000 and the size of the full training dataset, respectively.
```python
from flaml import tune
from functools import partial
from flaml.data import load_openml_task
def obj_from_resource_attr(resource_attr, X_train, X_test, y_train, y_test, config):
    from lightgbm import LGBMClassifier
    from sklearn.metrics import accuracy_score
    # in this example the sample size is our resource dimension
    resource = int(config[resource_attr])
    sampled_X_train = X_train.iloc[:resource]
    sampled_y_train = y_train[:resource]
    # construct an LGBM model from the config;
    # note that you need to first remove the resource_attr field
    # from the config as it is not part of the original search space
    model_config = config.copy()
    del model_config[resource_attr]
    model = LGBMClassifier(**model_config)
    model.fit(sampled_X_train, sampled_y_train)
    y_test_predict = model.predict(X_test)
    test_loss = 1.0 - accuracy_score(y_test, y_test_predict)
    return {resource_attr: resource, "loss": test_loss}
X_train, X_test, y_train, y_test = load_openml_task(task_id=7592, data_dir="test/")
max_resource = len(y_train)
resource_attr = "sample_size"
min_resource = 1000
analysis = tune.run(
    partial(obj_from_resource_attr, resource_attr, X_train, X_test, y_train, y_test),
    config={
        "n_estimators": tune.lograndint(lower=4, upper=32768),
        "max_leaves": tune.lograndint(lower=4, upper=32768),
        "learning_rate": tune.loguniform(lower=1 / 1024, upper=1.0),
    },
    metric="loss",
    mode="min",
    resource_attr=resource_attr,
    scheduler="flaml",
    max_resource=max_resource,
    min_resource=min_resource,
    reduction_factor=2,
    time_budget_s=10,
    num_samples=-1,
)
```
You can find more details about this scheduler in [this paper](https://arxiv.org/pdf/1911.04706.pdf).
#### 2. A scheduler of the [`TrialScheduler`](https://docs.ray.io/en/latest/tune/api_docs/schedulers.html#tune-schedulers) class from `ray.tune`.
A handful of schedulers of this type are implemented in `ray.tune`, for example, [ASHA](https://docs.ray.io/en/latest/tune/api_docs/schedulers.html#asha-tune-schedulers-ashascheduler), [HyperBand](https://docs.ray.io/en/latest/tune/api_docs/schedulers.html#tune-original-hyperband), [BOHB](https://docs.ray.io/en/latest/tune/api_docs/schedulers.html#tune-scheduler-bohb), etc.
To use this type of scheduler, you can either (1) set `scheduler='asha'`, which will automatically create an [ASHAScheduler](https://docs.ray.io/en/latest/tune/api_docs/schedulers.html#asha-tune-schedulers-ashascheduler) instance using the provided inputs (`resource_attr`, `min_resource`, `max_resource`, and `reduction_factor`); or (2) create an instance yourself and provide it via `scheduler`, as shown in the following code example:
```python
# require: pip install flaml[ray]
from ray.tune.schedulers import HyperBandScheduler
my_scheduler = HyperBandScheduler(time_attr="sample_size", max_t=max_resource, reduction_factor=2)
tune.run(..., scheduler=my_scheduler, ...)
```
- Similar to the case where the `flaml` scheduler is used, you need to specify the resource dimension, use the resource dimension accordingly in your `evaluation_function`, and provide the necessary information needed for scheduling, such as `min_resource`, `max_resource` and `reduction_factor` (depending on the requirements of the specific scheduler).
- Different from the case when the `flaml` scheduler is used, the amount of resources to use at each iteration is not suggested by the search algorithm through the `resource_attr` in a configuration. You need to specify the evaluation schedule explicitly yourself in the `evaluation_function` and report intermediate results (using `tune.report()`) accordingly. In the following code example, we use the ASHA scheduler by setting `scheduler="asha"`, specify `resource_attr`, `min_resource`, `max_resource` and `reduction_factor` the same way as in the previous example (when "flaml" is used as the scheduler), and perform the evaluation on a customized schedule.
```python
def obj_w_intermediate_report(resource_attr, X_train, X_test, y_train, y_test, min_resource, max_resource, config):
    from lightgbm import LGBMClassifier
    from sklearn.metrics import accuracy_score
    # a customized schedule to perform the evaluation
    eval_schedule = [res for res in range(min_resource, max_resource, 5000)] + [max_resource]
    for resource in eval_schedule:
        sampled_X_train = X_train.iloc[:resource]
        sampled_y_train = y_train[:resource]
        # construct an LGBM model from the config
        model = LGBMClassifier(**config)
        model.fit(sampled_X_train, sampled_y_train)
        y_test_predict = model.predict(X_test)
        test_loss = 1.0 - accuracy_score(y_test, y_test_predict)
        # need to report the resource attribute used and the corresponding intermediate results
        tune.report(sample_size=resource, loss=test_loss)
resource_attr = "sample_size"
min_resource = 1000
max_resource = len(y_train)
analysis = tune.run(
    partial(obj_w_intermediate_report, resource_attr, X_train, X_test, y_train, y_test, min_resource, max_resource),
    config={
        "n_estimators": tune.lograndint(lower=4, upper=32768),
        "learning_rate": tune.loguniform(lower=1 / 1024, upper=1.0),
    },
    metric="loss",
    mode="min",
    resource_attr=resource_attr,
    scheduler="asha",
    max_resource=max_resource,
    min_resource=min_resource,
    reduction_factor=2,
    time_budget_s=10,
    num_samples=-1,
)
```
### Warm start
Related arguments:
- `points_to_evaluate`: A list of initial hyperparameter configurations to run first.
- `evaluated_rewards`: If you have previously evaluated the parameters passed in as `points_to_evaluate`, you can avoid re-running those trials by passing in the reward attributes as a list so the optimizer can be told the results without needing to re-compute the trial. Must be the same length as `points_to_evaluate`.
If you are aware of some good hyperparameter configurations, you are encouraged to provide them via `points_to_evaluate`. The search algorithm will try them first and use them to bootstrap the search.
You can use previously evaluated configurations to warm-start your tuning.
For example, the following code tells `tune.run()` that the rewards of the two configs in `points_to_evaluate` are already known to be 3.99 and 2.99, respectively.
```python
from flaml import tune

def simple_obj(config):
    return config["a"] + config["b"]

config_search_space = {
    "a": tune.uniform(lower=0, upper=0.99),
    "b": tune.uniform(lower=0, upper=3),
}
# both configs lie within the search space defined above
points_to_evaluate = [
    {"a": 0.99, "b": 3},
    {"a": 0.99, "b": 2},
]
evaluated_rewards = [3.99, 2.99]
analysis = tune.run(
    simple_obj,
    config=config_search_space,
    mode="max",
    points_to_evaluate=points_to_evaluate,
    evaluated_rewards=evaluated_rewards,
    time_budget_s=10,
    num_samples=-1,
)
```
### Reproducibility
By default, there is randomness in our tuning process. If reproducibility is desired, you could
manually set a random seed before calling `tune.run()`. For example, in the following code, we call `np.random.seed(100)` to set the random seed.
With this random seed, running the following code multiple times will generate exactly the same search trajectory.
```python
import numpy as np
np.random.seed(100)
analysis = tune.run(
    simple_obj,
    config=config_search_space,
    mode="max",
    num_samples=10,
)
```
## Hyperparameter Optimization Algorithm
To tune the hyperparameters toward your objective, you will want to use a hyperparameter optimization algorithm which can help suggest hyperparameters with better performance (regarding your objective). `flaml` offers two HPO methods: CFO and BlendSearch. `flaml.tune` uses BlendSearch by default when the option [blendsearch] is installed.
<!-- ![png](images/CFO.png) | ![png](images/BlendSearch.png)
:---:|:---: -->
### CFO: Frugal Optimization for Cost-related Hyperparameters
CFO uses the randomized direct search method FLOW<sup>2</sup> with adaptive step size and random restart.
It requires a low-cost initial point as input if such a point exists.
The search begins with the low-cost initial point and gradually moves to
the high-cost region if needed. The local search method has a provable convergence
rate and bounded cost.
About FLOW<sup>2</sup>: FLOW<sup>2</sup> is a simple yet effective randomized direct search method.
It is an iterative optimization method that can optimize black-box functions.
FLOW<sup>2</sup> only requires pairwise comparisons between function values to perform its iterative updates. Compared to existing HPO methods, FLOW<sup>2</sup> has the following appealing properties:
1. It is applicable to general black-box functions with a good convergence rate in terms of loss.
1. It provides theoretical guarantees on the total evaluation cost incurred.
The GIFs below demonstrate an example search trajectory of FLOW<sup>2</sup>, shown in the loss space and the evaluation-cost (i.e., training time) space respectively. FLOW<sup>2</sup> is used to tune the number of leaves and the number of trees for XGBoost. The two background heatmaps show the loss and cost distribution of all configurations. The black dots are the points evaluated by FLOW<sup>2</sup>; black dots connected by lines are the points that yielded better loss when evaluated.
![gif](images/heatmap_loss_cfo_12s.gif) | ![gif](images/heatmap_cost_cfo_12s.gif)
:---:|:---:
From the demonstration, we can see that (1) FLOW<sup>2</sup> can quickly move toward the low-loss region, showing good convergence property and (2) FLOW<sup>2</sup> tends to avoid exploring the high-cost region until necessary.
Example:
```python
from flaml import CFO
tune.run(
    ...,
    search_alg=CFO(low_cost_partial_config=low_cost_partial_config),
)
```
**Recommended scenario**: There exist cost-related hyperparameters and a low-cost
initial point is known before optimization.
If the search space is complex and CFO gets trapped into local optima, consider
using BlendSearch.
### BlendSearch: Economical Hyperparameter Optimization With Blended Search Strategy
BlendSearch combines local search with global search. It leverages the frugality
of CFO and the space exploration ability of global search methods such as
Bayesian optimization. Like CFO, BlendSearch requires a low-cost initial point
as input if such point exists, and starts the search from there. Different from
CFO, BlendSearch will not wait for the local search to fully converge before
trying new start points. The new start points are suggested by the global search
method and filtered based on their distance to the existing points in the
cost-related dimensions. BlendSearch still gradually increases the trial cost.
It prioritizes among the global search thread and multiple local search threads
based on optimism in the face of uncertainty.
Example:
```python
# require: pip install flaml[blendsearch]
from flaml import BlendSearch
tune.run(
    ...,
    search_alg=BlendSearch(low_cost_partial_config=low_cost_partial_config),
)
```
**Recommended scenario**: Cost-related hyperparameters exist, a low-cost
initial point is known, and the search space is complex such that local search
is prone to be stuck at local optima.
**Suggestion about using larger search space in BlendSearch**.
In hyperparameter optimization, a larger search space is desirable because it is more likely to include the optimal configuration (or one of the optimal configurations) in hindsight. However, the performance (especially the anytime performance) of most existing HPO methods degrades if the cost of the configurations in the search space varies a lot. Thus hand-crafted small search spaces (with relatively homogeneous cost) are often used in practice for these methods, which is subject to idiosyncrasy. BlendSearch combines the benefits of local search and global search, which enables a smart (economical) way of deciding where to explore in the search space even though it is larger than necessary. This allows users to specify a larger search space in BlendSearch, which is often easier and a better practice than narrowing down the search space by hand.
For more technical details, please check our papers.
* [Frugal Optimization for Cost-related Hyperparameters](https://arxiv.org/abs/2005.01571). Qingyun Wu, Chi Wang, Silu Huang. AAAI 2021.
```bibtex
@inproceedings{wu2021cfo,
title={Frugal Optimization for Cost-related Hyperparameters},
author={Qingyun Wu and Chi Wang and Silu Huang},
year={2021},
booktitle={AAAI'21},
}
```
* [Economical Hyperparameter Optimization With Blended Search Strategy](https://www.microsoft.com/en-us/research/publication/economical-hyperparameter-optimization-with-blended-search-strategy/). Chi Wang, Qingyun Wu, Silu Huang, Amin Saied. ICLR 2021.
```bibtex
@inproceedings{wang2021blendsearch,
title={Economical Hyperparameter Optimization With Blended Search Strategy},
author={Chi Wang and Qingyun Wu and Silu Huang and Amin Saied},
year={2021},
booktitle={ICLR'21},
}
```