gist.githubusercontent.com/norandom/86a701a56b7de8c800a83eac293da813/raw/a9c7db1d46be633f344b4a07ff05d8985530b162/log2vec_wrapper.sh

Understanding the .vector versus the .log

The .vector format is line-based: each log line becomes one line in the .vector file, holding up to 32 vector dimensions.

Code Block
marius@mleng:~/source/sample_logs$ wc -l syslog.log 
12266 syslog.log
marius@mleng:~/source/sample_logs$ wc -l syslog.vector 
12267 syslog.vector
marius@mleng:~/source/sample_logs$ head -n 1 syslog.vector 
12266 32

The .vector file starts with a header line that states the number of lines (samples, here 12266) and the number of dimensions (32). This is why syslog.vector has one more line than syslog.log.

The vectors can be consumed by an ML pipeline.
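
As a minimal sketch of how a pipeline might read such a file, the Python snippet below parses the header and checks it against the number of data lines. The function names parse_vector_text and load_vector_file are hypothetical. The snippet also assumes that each data line contains only whitespace-separated numeric values, and that lines with fewer values than the header announces can be zero-padded; neither assumption is confirmed by the wrapper script, so adjust the parsing if the real output differs.

```python
from typing import List, Tuple


def parse_vector_text(text: str) -> Tuple[int, int, List[List[float]]]:
    """Parse .vector content: a 'samples dims' header followed by one vector per line.

    Assumption: data lines hold only whitespace-separated numbers.
    Assumption: rows shorter than 'dims' are zero-padded on the right.
    """
    lines = text.splitlines()
    if not lines:
        raise ValueError("empty .vector content")

    # Header line, e.g. "12266 32"
    n_samples, n_dims = (int(field) for field in lines[0].split())

    rows: List[List[float]] = []
    for line_no, line in enumerate(lines[1:], start=2):
        values = [float(field) for field in line.split()]
        if len(values) > n_dims:
            raise ValueError(f"line {line_no}: {len(values)} values, expected at most {n_dims}")
        values += [0.0] * (n_dims - len(values))  # zero-padding is an assumption
        rows.append(values)

    # The header must agree with the number of data lines (file lines minus one).
    if len(rows) != n_samples:
        raise ValueError(f"header announces {n_samples} samples, found {len(rows)}")
    return n_samples, n_dims, rows


def load_vector_file(path: str) -> Tuple[int, int, List[List[float]]]:
    """Read a .vector file from disk and parse it."""
    with open(path, encoding="utf-8") as handle:
        return parse_vector_text(handle.read())
```

Calling load_vector_file("syslog.vector") on the file shown above would be expected to return 12266 samples with 32 dimensions, provided the data lines match the assumed layout. The resulting list of rows can then be converted to whatever array type the ML pipeline expects.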