INDEX
    Explanations

    names of political figures

    specific geographical or cultural identifiers and names

    Negative Logits
    Token           Logit
    " envy"         -0.61
    "OPLE"          -0.55
    " kittens"      -0.55
    "FACE"          -0.54
    "staking"       -0.53
    " condem"       -0.53
    "lished"        -0.53
    " notch"        -0.52
    " readiness"    -0.50
    " puzz"         -0.50
    Positive Logits
    Token           Logit
    "oli"           0.76
    "ema"           0.75
    "ak"            0.75
    "ich"           0.74
    "oz"            0.74
    "am"            0.73
    "rad"           0.72
    "ana"           0.72
    "aj"            0.71
    "ar"            0.71
    Act Density 0.319%

    No Known Activations