INDEX
    Explanations
    Negative Logits
     Ma
    -0.09
    tok
    -0.09
     Oost
    -0.08
     woman's
    -0.08
    午夜
    -0.08
     vrouwelijke
    -0.08
     ATL
    -0.08
    	es
    -0.08
     Maaari
    -0.07
     Lancashire
    -0.07
    Positive Logits
     dévo
    0.07
    Calling
    0.07
    0.07
    0.07
    Shoot
    0.07
    balanced
    0.07
    Fold
    0.07
     shooter
    0.07
    State
    0.07
     State
    0.07
    Activation Density: 0.000%

    No Known Activations