    Negative Logits
    Token           Logit
    (blank)         -0.07
    " Este"         -0.06
    "-dashboard"    -0.06
    " вуз"          -0.06
    " MAIL"         -0.06
    " BU"           -0.06
    (blank)         -0.06
    ".feedback"     -0.06
    "_F"            -0.06
    " Yeni"         -0.06
    Positive Logits
    Token              Logit
    " alcohol"         0.13
    " Alcohol"         0.12
    "cohol"            0.09
    (blank)            0.08
    " Scot"            0.07
    " закон"           0.07
    "位于"             0.07
    " prostitutes"     0.07
    " lubric"          0.07
    ".VisualStudio"    0.06
    Activation Density: 0.007%

    No Known Activations