INDEX
    Explanations

    proper nouns, particularly names of individuals

    Negative Logits
    "acle"      -0.18
    ".datab"    -0.18
    "ismet"     -0.16
    " lapse"    -0.16
    "046"       -0.15
    "oka"       -0.14
    " prova"    -0.14
    "oken"      -0.14
    "uka"       -0.14
    "レット"    -0.14
    Positive Logits
    " sam"        0.23
    " Sam"        0.21
    "Sam"         0.16
    " SAM"        0.16
    " Bulk"       0.15
    " Samantha"   0.15
    " سام"        0.15
    "SAM"         0.15
    " mic"        0.15
    "iform"       0.14
    Activation Density: 0.016%

    No Known Activations