INDEX
    Explanations

    References to specific individuals or groups, particularly regarding their influence and actions in cultural or societal discussions

    Negative Logits
    λει
    -0.16
    uce
    -0.15
    roke
    -0.15
    uis
    -0.15
    rray
    -0.15
    istrovství
    -0.14
    erge
    -0.14
    ogui
    -0.14
    XF
    -0.14
    уру
    -0.14
    Positive Logits
    sein
    0.20
    iture
    0.17
    983
    0.14
     heure
    0.14
     etc
    0.14
    ANGUAGE
    0.14
     inn
    0.14
    USR
    0.14
     Wander
    0.14
    DateTime
    0.14
    Act Density 0.136%

    No Known Activations