Configuration
The configuration is statically defined in a YAML file. The application reads `anomdec.yml` from the `$ANOMDEC_HOME` path. The following sections describe the structure of this file.
streams
Streams is the main section of this file. It is a named list that defines how each signal is processed.
source / engine¶
It has two required sections, the source
and the engine
This is the minimal configuration to start processing signals but, at this point
we are not persisting the result.
```yaml
version: 1
streams:
- name: my_kafka_source_one
  source:
    type: kafka
    params:
      broker_servers: localhost:9092
      input_topic: test1
  engine:
    type: robust
    params:
      window: 30
      threshold: 0.9999
```
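The required sections can also be checked programmatically. The sketch below is hypothetical and not part of anomdec; it expresses the configuration above as a plain Python dict and verifies that every stream declares both required sections.

```python
# Hypothetical sketch (not anomdec code): check that every stream in the
# configuration declares the two required sections, "source" and "engine".

REQUIRED_SECTIONS = ("source", "engine")

def missing_sections(stream: dict) -> list:
    """Return the names of required sections absent from a stream entry."""
    return [s for s in REQUIRED_SECTIONS if s not in stream]

config = {
    "version": 1,
    "streams": [
        {
            "name": "my_kafka_source_one",
            "source": {
                "type": "kafka",
                "params": {"broker_servers": "localhost:9092",
                           "input_topic": "test1"},
            },
            "engine": {
                "type": "robust",
                "params": {"window": 30, "threshold": 0.9999},
            },
        },
    ],
}

for stream in config["streams"]:
    print(stream["name"], "missing:", missing_sections(stream))
    # prints: my_kafka_source_one missing: []
```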
sink
To persist the results we need to add a `sink` configuration section. This can be a list of sinks.
```yaml
version: 1
streams:
- name: my_kafka_source_one
  source:
    type: kafka
    params:
      broker_servers: localhost:9092
      input_topic: test1
  engine:
    type: robust
    params:
      window: 30
      threshold: 0.9999
  sink:
  - name: sqlite
    type: repository
    repository:
      type: sqlite
      params:
        database: /tmp/my_kafka_source_one.sqlite
  - name: kafka
    type: stream
    stream:
      type: kafka
      params:
        broker_servers: localhost:9092
        output_topic: test2
```
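Because `sink` is a list, each result produced by the engine is delivered to every configured sink. A minimal, hypothetical sketch of that fan-out (the class names below are illustrative, not anomdec's API):

```python
# Hypothetical sketch: every engine result is written to every configured sink.

class MemorySink:
    """Stand-in for a repository or stream sink; records what it receives."""
    def __init__(self, name: str):
        self.name = name
        self.received = []

    def write(self, tick: dict) -> None:
        self.received.append(tick)

sinks = [MemorySink("sqlite"), MemorySink("kafka")]

tick = {"value": 1.2, "is_anomaly": False}
for sink in sinks:          # fan-out: one result, all sinks
    sink.write(tick)

print([s.name for s in sinks if s.received])  # prints: ['sqlite', 'kafka']
```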
warmup
The `warmup` section has two roles: the first is to warm up the engine before it starts making predictions; the second is to make the data accessible from the dashboard for visualization. We will define a `warmup` configuration section with one repository that is also used in `sink`.
```yaml
version: 1
streams:
- name: my_kafka_source_one
  source:
    type: kafka
    params:
      broker_servers: localhost:9092
      input_topic: test1
  engine:
    type: robust
    params:
      window: 30
      threshold: 0.9999
  sink:
  - name: sqlite
    type: repository
    repository:
      type: sqlite
      params:
        database: /tmp/my_kafka_source_one.sqlite
  - name: kafka
    type: stream
    stream:
      type: kafka
      params:
        broker_servers: localhost:9092
        output_topic: test2
  warmup:
  - name: sqlite
    repository:
      type: sqlite
      params:
        database: /tmp/my_kafka_source_one.sqlite
```
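To see why warming up matters for a windowed engine, consider this toy sketch. It is not anomdec's robust engine, just an illustrative stand-in: without history the engine would start with an empty window, while replaying repository data first gives it a usable baseline immediately.

```python
import statistics
from collections import deque

class ToyWindowEngine:
    """Toy stand-in for a windowed engine (not anomdec's robust engine):
    flags a value whose distance from the window mean exceeds a z-score."""
    def __init__(self, window: int = 30, z_limit: float = 3.0):
        self.values = deque(maxlen=window)
        self.z_limit = z_limit

    def warm_up(self, history) -> None:
        """Replay historical values, e.g. fetched from the warmup repository."""
        for value in history:
            self.values.append(value)

    def predict(self, value: float) -> bool:
        if len(self.values) < 2:
            anomaly = False          # not enough context yet
        else:
            mean = statistics.mean(self.values)
            stdev = statistics.pstdev(self.values) or 1.0
            anomaly = abs(value - mean) / stdev > self.z_limit
        self.values.append(value)
        return anomaly

engine = ToyWindowEngine(window=30)
engine.warm_up([10.0] * 30)          # history replayed from the repository
print(engine.predict(10.2), engine.predict(100.0))  # prints: False True
```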
repository
A repository section can be found in both the `sink` and the `warmup` sections. It defines a storage backend supported by the `BaseSink` implementations, `RepositorySink` and `ObservableRepository` respectively. A `sink` repository can also be used as `warmup` to warm up the model, in case the model requires previous data to evaluate new data. Although `warmup` is defined as a list, only the first element is used to warm up the model.
```yaml
repository:
  type: sqlite
  params:
    database: /tmp/my_kafka_source_one.sqlite
```
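The dual role of a repository can be sketched with the standard-library `sqlite3` module. The table layout below is hypothetical, not anomdec's actual schema: the same database is written to in the sink role and read back in the warmup role.

```python
import sqlite3

# Hypothetical schema: the same sqlite database acts as a sink (save)
# and as a warmup source (fetch). This is not anomdec's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ticks (ts REAL, value REAL, is_anomaly INTEGER)")

def save(ts: float, value: float, is_anomaly: bool) -> None:
    """Sink role: persist one engine result."""
    conn.execute("INSERT INTO ticks VALUES (?, ?, ?)",
                 (ts, value, int(is_anomaly)))

def fetch_history() -> list:
    """Warmup role: replay persisted values in time order."""
    return [row[0] for row in
            conn.execute("SELECT value FROM ticks ORDER BY ts")]

save(1.0, 0.5, False)
save(2.0, 9.9, True)
print(fetch_history())  # prints: [0.5, 9.9]
```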
websocket
There is a `websocket` section that is used to send output ticks to the dashboard. This allows dashboard plots to be updated in real time.
```yaml
websocket: ws://localhost:5000/ws/
```
Example configuration file
anomdec.yml
A full example configuration. It reflects a complete message flow: reading from a Kafka broker and processing with the robust detector, which is warmed up from the same repository that persists the output.
```yaml
version: 1
websocket: ws://localhost:5000/ws/
streams:
- name: my_kafka_source_one
  source:
    type: kafka
    params:
      broker_servers: localhost:9092
      input_topic: test1
  engine:
    type: robust
    params:
      window: 30
      threshold: 0.9999
  sink:
  - name: sqlite
    type: repository
    repository:
      type: sqlite
      params:
        database: /tmp/my_kafka_source_one.sqlite
  - name: kafka
    type: stream
    stream:
      type: kafka
      params:
        broker_servers: localhost:9092
        output_topic: test2
  warmup:
  - name: sqlite
    repository:
      type: sqlite
      params:
        database: /tmp/my_kafka_source_one.sqlite
```
diagram
Here is a diagram that represents the full configuration file. The output of the engine can be sent both to a repository and to a streaming system, to visualize anomalies and react to them, and the repository is also used to warm up the engine in case of a restart or failure.
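The flow in the diagram can be sketched end to end. Everything below is illustrative and not anomdec's API: a source yields values, an engine scores them, and every result fans out to all sinks, with the repository sink doubling as warmup storage after a restart.

```python
# Hypothetical end-to-end sketch of the configured flow:
# source -> engine -> sinks (repository + stream); the repository
# could later be replayed to warm up the engine after a restart.

class MeanEngine:
    """Toy engine: flags values more than `threshold` above the running mean."""
    def __init__(self, threshold: float):
        self.seen = []
        self.threshold = threshold

    def predict(self, value: float) -> bool:
        anomaly = bool(self.seen) and \
            value - (sum(self.seen) / len(self.seen)) > self.threshold
        self.seen.append(value)
        return anomaly

class ListSink:
    """Stand-in for both the repository sink and the kafka stream sink."""
    def __init__(self):
        self.results = []

    def write(self, result: dict) -> None:
        self.results.append(result)

def run(source, engine, sinks) -> None:
    for value in source:
        result = {"value": value, "is_anomaly": engine.predict(value)}
        for sink in sinks:          # fan-out to every configured sink
            sink.write(result)

repository, stream = ListSink(), ListSink()
run([1.0, 1.1, 0.9, 9.0], MeanEngine(threshold=3.0), [repository, stream])
print([r["value"] for r in repository.results if r["is_anomaly"]])  # prints: [9.0]
```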