...
conda env as YAML (Python 3.9, 16.1.2024)
...
This builds a fully functional Log2Vec environment on Python 3.9. The original release targeted Python 3.6. There will be some deprecation warnings, but I believe they can be safely ignored.
...
Code Block |
---|
curl https://gist.githubusercontent.com/norandom/a1fd048d7d870a90aa72c9c45fd44e02/raw/f8c6ad9c5470b5380d4bcea8eaa237dd64217f9d/conda_env_log2vec.yml -o log2vec_conda.yml
conda env create -f log2vec_conda.yml
conda activate log2vec
... # the conda env is stored in the user's home directory
git clone https://github.com/NetManAIOps/Log2Vec
# follow the steps |
...
This makes it possible to use the Log2Vec library for automated log file vectorization, based on the semantic embedding and NLP approach demonstrated in the paper.
Understanding the .vector versus the .log
The .vector format is line-based: each line holds one vector with up to 32 dimensions.
Code Block |
---|
marius@mleng:~/source/sample_logs$ wc -l syslog.log
12266 syslog.log
marius@mleng:~/source/sample_logs$ wc -l syslog.vector
12267 syslog.vector
marius@mleng:~/source/sample_logs$ head -n 1 syslog.vector
12266 32
|
A header line is added at the top of the .vector file with the number of samples (12266) and the number of dimensions (32). Therefore, the .vector file has one more line than the .log file.
The vectors can be consumed by an ML pipeline.
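As a starting point for such a pipeline, here is a minimal sketch of a reader for the .vector file. It relies on the header described above ("samples dimensions" on the first line). The layout of the remaining lines is an assumption: the sketch expects each one to hold whitespace-separated floats, which is not verified against the Log2Vec output, so adjust the row parsing if your file differs. The function names `parse_vector_text` and `load_vector_file` are hypothetical helpers, not part of Log2Vec.

```python
from typing import List, Tuple


def parse_vector_text(text: str) -> Tuple[int, int, List[List[float]]]:
    """Parse the contents of a .vector file into (samples, dims, rows).

    Assumption: every line after the header contains only
    whitespace-separated floats (one vector per line).
    """
    lines = text.strip().splitlines()

    # Header line: "<number of samples> <dimensions>", e.g. "12266 32"
    n_samples, n_dims = (int(field) for field in lines[0].split())

    # One vector per remaining non-empty line
    rows = [[float(v) for v in line.split()] for line in lines[1:] if line.strip()]

    # The header must match the number of vectors that follow it
    if len(rows) != n_samples:
        raise ValueError(f"header announces {n_samples} samples, found {len(rows)}")

    # "Up to" n_dims values per line: shorter rows are accepted, longer ones are not
    for index, row in enumerate(rows):
        if len(row) > n_dims:
            raise ValueError(f"row {index} has {len(row)} values, expected at most {n_dims}")

    return n_samples, n_dims, rows


def load_vector_file(path: str) -> Tuple[int, int, List[List[float]]]:
    """Read a .vector file from disk and parse it."""
    with open(path, encoding="utf-8") as handle:
        return parse_vector_text(handle.read())
```

Calling `load_vector_file("syslog.vector")` on the sample above would be expected to return 12266 rows if the assumed row layout holds. The returned list of lists can then be converted to whatever array type the pipeline uses.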