INDEX
    Explanations

    phrases related to conflict or violence

    NEGATIVE LOGITS
    EMALE  -0.17
    ue  -0.16
    igin  -0.15
    .Require  -0.15
    iev  -0.14
    ulan  -0.14
     подоб  -0.14
    _ALWAYS  -0.14
     особенно  -0.14
     SUCH  -0.13
    POSITIVE LOGITS
     something  0.16
     either  0.15
     somebody  0.15
     ._  0.15
     ****************************************************************************  0.14
    _suite  0.14
    rowable  0.14
     someone  0.14
     pretty  0.14
     nada  0.13
    Act Density 0.146%

    No Known Activations