43 Commits

Author SHA1 Message Date
Susan Xueqing Liu
f01acb67f6 update model of text summarization (#1030) 2023-05-10 00:48:22 +00:00
Jirka Borovec
a701cd82f8 set black with 120 line length (#975)
* set black with 120 line length

* apply pre-commit

* apply black
2023-04-10 19:50:40 +00:00
Susan Xueqing Liu
ef5a17cd83 handling nlp divide by zero (#926)
* handling nlp divide by zero

* catching zerodivisionerror

* catching zerodivisionerror

* catching zerodivisionerror

* addressing comments

* addressing comments

* updating test case

* update

* add blank to last line

* update nlp notebook

* rerun

* rerun

* sync with main

* add model selection for nlg

* addressing keyerror

* add raise exception

* update

* fix bug

* revert

* updating automl_nlp

* Update flaml/automl/model.py

Co-authored-by: Zvi Baratz <z.baratz@gmail.com>

* address comments

* address comments

---------

Co-authored-by: Li Jiang <lijiang1@microsoft.com>
Co-authored-by: Zvi Baratz <z.baratz@gmail.com>
2023-04-09 16:53:30 +00:00
Jirka Borovec
2ff1035733 precommit: end-of-file-fixer (#929)
* precommit: end-of-file-fixer

* exclude .gitignore

* apply

---------

Co-authored-by: Shaokun <shaokunzhang529@gmail.com>
2023-02-28 16:27:14 +00:00
Jirka Borovec
6aa1d16ebc pre-commit: update config (#925)
* update config

* apply precommit
2023-02-22 00:49:38 +00:00
Mark Harley
44ddf9e104 Refactor into automl subpackage (#809)
* Refactor into automl subpackage

Moved some of the packages into an automl subpackage to tidy before the
task-based refactor. This is in response to discussions with the group
and a comment on the first task-based PR.

Only changes here are moving subpackages and modules into the new
automl, fixing imports to work with this structure and fixing some
dependencies in setup.py.

* Fix doc building post automl subpackage refactor

* Fix broken links in website post automl subpackage refactor

* Fix broken links in website post automl subpackage refactor

* Remove vw from test deps as this is breaking the build

* Move default back to the top-level

I'd moved this to automl as that's where it's used internally, but had
missed that this is actually part of the public interface so makes sense
to live where it was.

* Re-add top level modules with deprecation warnings

flaml.data, flaml.ml and flaml.model are re-added to the top level,
being re-exported from flaml.automl for backwards compatibility. Adding
a deprecation warning so that we can have a planned removal later.

* Fix model.py line-endings

* Pin pytorch-lightning to less than 1.8.0

We're seeing strange lightning related bugs from pytorch-forecasting
since the release of lightning 1.8.0. Going to try constraining this to
see if we have a fix.

* Fix the lightning version pin

Was optimistic with setting it in the 1.7.x range, but that isn't
compatible with python 3.6

* Remove lightning version pin

* Revert dependency version changes

* Minor change to retrigger the build

* Fix line endings in ml.py and model.py

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
Co-authored-by: EgorKraevTransferwise <egor.kraev@transferwise.com>
2022-12-06 15:46:08 -05:00
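
The re-export with a deprecation warning described in the commit above (#809) can be sketched in general terms as follows. This is only an illustration, not the code from that commit: the module names `mypkg.data` and `mypkg.automl.data`, the helper `make_deprecated_alias`, and the function `load_sample` are all hypothetical, and the sketch assumes Python 3.7+ so that a module-level `__getattr__` (PEP 562) is honored.

```python
import types
import warnings


def make_deprecated_alias(old_name: str, new_module: types.ModuleType) -> types.ModuleType:
    """Build a shim module that forwards attribute access to new_module and warns."""
    shim = types.ModuleType(old_name)

    def _forward(attr: str):
        # Invoked only when `attr` is not defined on the shim itself (PEP 562).
        warnings.warn(
            f"{old_name} is deprecated; use {new_module.__name__} instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(new_module, attr)

    # A function stored under "__getattr__" in a module's namespace acts as
    # the fallback for missing attributes on that module.
    shim.__getattr__ = _forward
    return shim


# Hypothetical modules standing in for the new and the old import locations.
new_data = types.ModuleType("mypkg.automl.data")
new_data.load_sample = lambda: [1, 2, 3]
old_data = make_deprecated_alias("mypkg.data", new_data)
```

Accessing `old_data.load_sample` still works but emits a `DeprecationWarning`, which gives callers time to migrate before a planned removal.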
Li Jiang
2501b86444 fix typo of output directory (#828)
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2022-11-30 17:04:29 -08:00
Chi Wang
70d86942f4 skip test in py 3.6 (#832) 2022-11-29 13:10:35 -08:00
Chi Wang
595af7a04f install editable package in codespace (#826)
* install editable package in codespace

* fix test error in test_forecast

* fix test error in test_space

* openml version

* break tests; pre-commit

* skip on py10+win32

* install mlflow in test

* install mlflow in [test]

* skip test in windows

* import

* handle PermissionError

* skip test in windows

* skip test in windows

* skip test in windows

* skip test in windows

* remove ts_forecast_panel from doc
2022-11-27 14:22:54 -05:00
Chi Wang
30e200985c Fix issues related to zero-shot automl (#783)
* skip in-search-space check for small max iter

* resolve Pickle Transformer #730

* resolve default config unrecognized #784

* Change definition of init_config

* copy points_to_evaluate

* make test pass

* check learner selector
2022-11-13 12:47:59 -08:00
Susan Xueqing Liu
2ebddd67ae Remove NLP classification head (#756)
* rm classification head in nlp

* rm classification head in nlp

* rm classification head in nlp

* adding test cases for switch classification head

* adding test cases for switch classification head

* Update test/nlp/test_autohf_classificationhead.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* adding test cases for switch classification head

* run each test separately

* skip classification head test on windows

* disabling wandb reporting

* fix test nlp custom metric

* fix test nlp custom metric

* fix test nlp custom metric

* fix test nlp custom metric

* fix test nlp custom metric

* fix test nlp custom metric

* fix test nlp custom metric

* fix test nlp custom metric

* fix test nlp custom metric

* fix test nlp custom metric

* fix test nlp custom metric

* Update website/docs/Examples/AutoML-NLP.md

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update website/docs/Examples/AutoML-NLP.md

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* fix test nlp custom metric

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2022-10-12 17:04:42 -07:00
Xueqing Liu
ceb3e300cd Issue724 (#745)
* fixing issue724

* fixing issue724
2022-10-04 10:51:12 -04:00
Xueqing Liu
3d1a28bfc0 Add preserve_checkpoint to preserve the checkpoint after del (#692)
* fix del bug
2022-08-20 18:17:10 -04:00
Xueqing Liu
21fa6c10ec Fixing the issue that FLAML trial number is significantly smaller than Transformers.hyperparameter_search (#657)
* fix 636

* adding low cost config

* update padding; update tokenization output y type (series -> DF); update low cost init config

* updating todf; updating metric_loss_score
2022-08-03 00:11:29 -04:00
Xueqing Liu
5eb5d43d7f Fix HPO evaluation bug (#645)
* fix eval automl metric bug on val_loss inconsistency

* updating starting point search space to continuous

* shortening notebook
2022-07-28 23:08:42 -04:00
Xueqing Liu
731afec9eb This PR fixes the frequent NLP bugs in the other PRs (#647)
* fix nlp bug

* resetting model to electra small

* removing model_path from fit_kwargs_by_estimator
2022-07-25 17:46:33 -04:00
Qingyun Wu
b7846048dc Allow FLAML_sample_size in starting_points (#619)
* FLAML_sample_size

* clean up

* starting_points as a list

* catch AssertionError

* per estimator sample size

* import

* per estimator min_sample_size

* Update flaml/automl.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update test/automl/test_warmstart.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* add warnings

* adding more tests

* fix a bug in validating starting points

* improve test

* revise test

* revise test

* documentation about custom_hp

* doc and efficiency

* update test

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2022-07-09 16:04:46 -04:00
Xueqing Liu
6108493e0b fix ner bug; refactor post processing of TransformersEstimator prediction (#615)
* fix ner bug; refactor post processing

* fix too many values to unpack

* supporting id/token label for NER
2022-07-05 13:38:21 -04:00
Chi Wang
cbb85e2aab Py36 (#614)
* allow installation in py 3.6

* test py 3.6
2022-06-26 08:32:28 -07:00
Chi Wang
49e8f7f028 use zeroshot when no budget is given; custom_hp (#563)
* use zeroshot when no budget is given; custom_hp

* update Getting-Started

* protobuf version

* X_val
2022-05-28 17:22:09 -07:00
Xueqing Liu
2a8decdc50 fix the post-processing bug in NER (#534)
* fix conll bug

* update DataCollatorForAuto

* adding label_list comments
2022-05-10 17:22:57 -04:00
Xueqing Liu
ca35fa969f refactoring TransformersEstimator to support default and custom_hp (#511)
* refactoring TransformersEstimator to support default and custom_hp

* handling starting_points not in search space

* addressing starting point more than max_iter

* fixing upper < lower bug
2022-04-28 14:06:29 -04:00
Chi Wang
9128c8811a handle failing trials (#505)
* handle failing trials

* clarify when to return {}

* skip ensemble in accuracy check
2022-03-28 16:57:52 -07:00
Xueqing Liu
72301b8568 fixing a few bugs in nlp (#503)
* fixing bugs in nlp
2022-03-26 14:08:51 -04:00
Xueqing Liu
5f97532986 adding evaluation (#495)
* adding automl.score

* fixing the metric name in train_with_config

* adding pickle after score

* fixing a bug in automl.pickle
2022-03-25 17:00:08 -04:00
Xueqing Liu
af423463c3 fixing bug for ner (#463)
* fixing bug for ner

* removing global var

* adding class for trial counter

* adding notebook

* adding use_ray dict

* updating documentation for nlp
2022-03-20 22:03:02 -04:00
Chi Wang
b4d312412a bump ray version to 1.10 (#450)
* bump ray version to 1.10

* init ray in test

* Update setup.py to include hotfixes

Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2022-02-09 15:04:29 -08:00
Chi Wang
6960a833ec Gpu support for xgboost (#442)
* xgboost gpu support

* test xgboost gpu

* test sparse data

* add xgboost test

* remove ray.init to avoid pytest error
2022-01-30 13:02:18 -08:00
Xueqing Liu
438ccaa0c9 adding catch for HTTP error (#432) 2022-01-29 22:53:32 -08:00
Xueqing Liu
4814091d87 remove redundant imports (#426)
* remove redundant imports

* getting rid of hf dataset
2022-01-24 14:24:14 -08:00
Xueqing Liu
dda4ac90a1 moving intermediate_results logging from model.py to huggingface/trainer.py (#403)
* replacing val_loss with automl_metric
2022-01-14 17:26:10 -08:00
Chi Wang
569908fbe6 fix issues in logging, bug in space.py, constraint sign, and improve code coverage (#388)
* console log handler

* version update

* doc

* skippable steps

* notebook update

* constraint sign

* doc for constraints

* bug fix: define-by-run and unflatten_hierarchical

* const

* handle nested space in indexof()

* test grid search

* test suggestion

* model test

* >1 ckpts

* always increase iter count

* log total # iterations

* security patch

* make iter_per_learner consistent
2022-01-14 13:39:09 -08:00
Xueqing Liu
f41f1c2198 Logging multiple checkpoints (#394) 2022-01-12 19:50:39 -08:00
Xueqing Liu
bd66e40296 fixing load best model at the end (#389) 2022-01-11 10:47:53 -08:00
Chi Wang
612668e8ed serialize TransformerEstimator (#381)
* serialize TransformerEstimator

* check has_attr

* custom metric needs trainer

* skip test on mac
2022-01-06 10:28:19 -08:00
Xueqing Liu
207b6935d9 adding token classification (#376)
* adding ner
2022-01-03 13:44:10 -05:00
oberonbot
9c00e4272a Finish the Multiple Choice Classification (#367)
* adding multiple choice

* update test cases (hard coded)

* merged common code in predict_proba and predict in TransformersEstimator
2022-01-02 20:12:34 -05:00
Xueqing Liu
b2900f4b22 fixing custom metric (#357)
* fixing the error for custom metric
2021-12-24 16:23:09 -05:00
Xueqing Liu
dcfd218108 Fixing the bug in custom metric (#356)
* fixing the bug for custom metric
2021-12-23 18:44:53 -05:00
Xueqing Liu
ee3162e232 Adding the NLP task summarization (#346)
* Add test_autohf_summarization.py

* adding seq2seq

* Update flaml/nlp/huggingface/trainer.py

* rouge metrics

Co-authored-by: XinZofStevens <xzhao4346@gmail.com>
Co-authored-by: JinzhuoWu <wujinzhuo0105@gmail.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-12-20 14:19:32 -08:00
Xueqing Liu
fb59bb9928 adding TODOs for NLP module, so students can implement other tasks easier (#321)
* fixing ray pickle bug, skipping macosx bug, completing code for seqregression

* catching connectionerror

* adding TODOs for NLP module
2021-12-03 12:45:16 -05:00
Xueqing Liu
fd136b02d1 bug fix for TransformerEstimator (#293)
* fix checkpoint naming + trial id for non-ray mode, fix the bug in running test mode, delete all the checkpoints in non-ray mode

* finished testing for checkpoint naming, delete checkpoint, ray, max iter = 1

* adding predict_proba, address PR 293's comments

close #293 #291
2021-11-23 11:26:39 -08:00
Chi Wang
72caa2172d model_history, ITER_HP, settings in AutoML(), checkpoint bug fix (#283)
if save_best_model_per_estimator is False and retrain_final is True, unfit the model after evaluation in HPO.
retrain if using ray.
update ITER_HP in config after a trial is finished.
change prophet logging level.
example and notebook update.
allow settings to be passed to AutoML constructor. For "Are you planning to add multi-output-regression capability to FLAML" (#192) and "Is multi-tasking allowed?" (#277), one can pass the automl settings to the constructor instead of requiring a derived class.
remove model_history.
checkpoint bug fix.

* model_history meaning save_best_model_per_estimator

* ITER_HP

* example update

* prophet logging level

* comment update in forecast notebook

* print format improvement

* allow settings to be passed to AutoML constructor

* checkpoint bug fix

* time limit for autohf regression test

* skip slow test on macos

* cleanup before del
2021-11-18 09:39:45 -08:00