INDEX
    Explanations

    mentions of sports teams

    Negative Logits
    affe        -0.17
    oin         -0.14
     oneself    -0.14
    ुभ          -0.14
    cel         -0.14
     blot       -0.14
    rade        -0.14
    bj          -0.13
    illac       -0.13
    heid        -0.13
    Positive Logits
    ichern          0.18
    ichel           0.15
    字幕            0.15
    тож             0.15
    bris            0.14
    InstanceState   0.14
    elper           0.14
    pecies          0.13
     Bilg           0.13
    hx              0.13
    Act Density 0.040%

    No Known Activations