    NEGATIVE LOGITS
    Token         Logit
    "cess"        -0.74
    "FACE"        -0.69
    "cellence"    -0.66
    "iencies"     -0.65
    "ifice"       -0.64
    "icans"       -0.63
    "NEWS"        -0.62
    "blem"        -0.62
    "ican"        -0.62
    " Colleges"   -0.62
    POSITIVE LOGITS
    Token         Logit
    " Friedrich"  0.99
    "stad"        0.91
    "sson"        0.88
    " Frankfurt"  0.81
    " Sabha"      0.77
    " Gaal"       0.76
    " Ing"        0.75
    "vik"         0.75
    " Ernst"      0.74
    "etter"       0.74
    Activation Density: 0.044%

    No Known Activations