INDEX
    Explanations

    proper nouns, specifically names of individuals or entities

    proper nouns, specifically names of people and organizations

    Negative Logits
    ires          -0.82
    auga          -0.78
    ECH           -0.74
     Jakarta      -0.71
     Samantha     -0.70
    onga          -0.70
    urtle         -0.70
    ansas         -0.69
    IRED          -0.69
     Armenia      -0.69
    Positive Logits
     Frey         0.90
    swer          0.81
    _{            0.80
    flush         0.80
    cipled        0.79
    Reply         0.77
    vous          0.75
    ezvous        0.75
    interstitial  0.75
    tag           0.74
    Act Density 0.016%

    No Known Activations