    Explanations

    violence, gore

    Negative Logits
    isan        -0.07
    LT          -0.06
     rightly    -0.06
    ANNEL       -0.06
     readline   -0.06
    WND         -0.06
    HTML        -0.06
     Proud      -0.06
    	js         -0.06
     lines      -0.05
    Positive Logits
    ilitary     0.07
    428         0.07
    [missing]   0.07
     nữa        0.07
     группы     0.07
     havoc      0.07
     potrav     0.06
    [missing]   0.06
    949         0.06
     AUT        0.06
    Activation Density: 0.005%

    No Known Activations