INDEX
    Explanations

    references to specific names, potentially related to people or places

    proper nouns, specifically names related to individuals or characters

    Negative Logits
    "Ŀ"           -0.70
    "¤"           -0.68
    "selves"      -0.67
    " Cortana"    -0.64
    " Lisbon"     -0.64
    "ĩ"           -0.63
    "tics"        -0.62
    "pron"        -0.62
    "ı"           -0.61
    " WATCHED"    -0.61
    Positive Logits
    "bard"        1.25
    " Roth"       1.13
    "roth"        0.99
    "stein"       0.98
    "haar"        0.95
    "igan"        0.89
    "punk"        0.88
    "schild"      0.88
    "heit"        0.87
    "leigh"       0.87
    Activation Density: 0.009%

    No Known Activations