INDEX
    Explanations

    mentions of public figures involved in controversial statements or actions related to societal issues

    Negative Logits
    Token     Logit
    чат       -0.15
     kuş      -0.14
    overs     -0.14
     eldre    -0.14
    uis       -0.14
    nou       -0.13
    班        -0.13
    loys      -0.13
    arez      -0.13
    NEY       -0.13
    Positive Logits
    Token     Logit
    yun       0.15
    batim     0.15
    prm       0.15
     vide     0.15
     Sof      0.14
    icter     0.14
    ばかり    0.14
    lap       0.14
    ogl       0.13
     Sk       0.13
    Activation Density: 0.179%

    No Known Activations