INDEX
    Explanations

    proper nouns related to entities or individuals

    Negative Logits
    Token         Logit
    " psi"        -0.80
    " VOL"        -0.74
    " Shelley"    -0.74
    " USPS"       -0.73
    " KN"         -0.70
    " stre"       -0.69
    " Volvo"      -0.68
    " MU"         -0.68
    " Schl"       -0.67
    " Phill"      -0.67
    Positive Logits
    Token         Logit
    "ad"          1.77
    "ads"         1.55
    "AD"          1.42
    "adic"        1.34
    "adh"         1.34
    "adan"        1.22
    "adian"       1.18
    "ada"         1.17
    "adin"        1.17
    "ador"        1.17
    Activation Density: 0.058%

    No Known Activations