    Explanations

    proper nouns or names

    Negative Logits
    Token         Logit
    " Willis"     -0.94
    "Mont"        -0.88
    "978"         -0.85
    " Mont"       -0.79
    " Wynne"      -0.78
    " Truman"     -0.76
    " Painter"    -0.75
    " neutron"    -0.75
    " Polk"       -0.73
    "Mel"         -0.72
    Positive Logits
    Token         Logit
    "ig"          1.44
    "igs"         1.42
    "IG"          1.40
    "ags"         1.32
    "ag"          1.31
    "og"          1.31
    " Sag"        1.21
    " Sig"        1.18
    "OG"          1.16
    "AG"          1.14
    Activation Density: 0.352%

    No Known Activations