INDEX
    Explanations
    Negative Logits
     гориз     -0.09
    _cursor    -0.08
     Benef     -0.08
     submet    -0.08
     Krit      -0.08
     लाभ       -0.08
     Stake     -0.07
     discíp    -0.07
    -0.07
    -0.07
    Positive Logits
     profanity   0.13
     marijuana   0.10
     языка       0.09
     слово       0.09
     langue      0.09
     obscene     0.09
     muff        0.08
    -color       0.08
     loudly      0.08
     cannabis    0.08
    Activation Density: 0.012%

    No Known Activations