Previously, we were sending parsed benchmark results to a
Prometheus instance. Due to its time-series nature, Prometheus would
downsample database content to avoid having too many data points
for a given time range. While this behavior is fine for a
continuous stream of data, like monitoring CPU load, it is not
suited to benchmarks. Benchmarks are discrete events that occur
only once in a while (e.g. once a day), and downsampling would, at
some point, simply drop some of the benchmark results.
Using a regular SQL database like PostgreSQL solves this issue.
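As a rough illustration only (the table name, columns and connection
details are assumptions, not part of this change), storing one
benchmark result as a plain row in PostgreSQL could look like the
following Python sketch using psycopg2:

    # Hypothetical sketch: persisting one benchmark result as a row.
    # Table name, columns and DSN are illustrative assumptions.
    import datetime
    import psycopg2

    def store_result(name, duration_s, run_at):
        conn = psycopg2.connect("dbname=benchmarks user=ci")  # assumed DSN
        try:
            with conn, conn.cursor() as cur:
                cur.execute(
                    "INSERT INTO benchmark_results (name, duration_s, run_at)"
                    " VALUES (%s, %s, %s)",
                    (name, duration_s, run_at),
                )
        finally:
            conn.close()

    store_result("example-benchmark", 0.042,
                 datetime.datetime.now(datetime.timezone.utc))

Because every run is kept as its own row, nothing is dropped or
downsampled, however sparse the benchmark events are.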
Benchmark only one compilation option in automatic benchmarks. Only the
'loop' option is tested, to take advantage of hardware with many
available CPUs. Running benchmarks with the 'default' option is
suboptimal on this kind of hardware since it uses only one CPU.
This also removes the time-consuming MNIST test, as it belongs in the ML benchmarks instead.
Moreover, the Makefile is fixed to use the provided Python executable
instead of relying on the system one to generate MLIR YAML files.
This includes several fixes and adds some functionality:
* the EC2 instance type can be selected when the workflow is triggered manually
* benchmarks will be run on each push to the main branch
* Docker is no longer used, due to building issues
* 10 repetitions are made during the benchmarks, then results are aggregated (see the sketch after this list)
* more tags are used to identify the benchmark configuration
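For illustration, aggregating the 10 repetitions can be as simple as
computing summary statistics over the per-repetition timings; the
function and field names below are assumptions, not the workflow's
actual code:

    # Hypothetical sketch of aggregating repeated benchmark timings.
    from statistics import mean, stdev

    def aggregate(timings_s):
        """Collapse per-repetition timings (in seconds) into summary stats."""
        return {
            "repetitions": len(timings_s),
            "mean_s": mean(timings_s),
            "stdev_s": stdev(timings_s) if len(timings_s) > 1 else 0.0,
            "min_s": min(timings_s),
            "max_s": max(timings_s),
        }

    # Example: 10 repetitions of the same benchmark.
    print(aggregate([1.02, 0.98, 1.01, 1.00, 0.99, 1.03, 0.97, 1.02, 1.00, 0.98]))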