INDEX
    Explanations

    proper nouns, specifically names and titles

    Negative Logits
    Token              Logit
    "bbe"              -0.17
    "lej"              -0.16
    " เต"              -0.15
    "ensa"             -0.15
    "ObjectContext"    -0.15
    "reon"             -0.14
    "pite"             -0.14
    "ιχ"               -0.14
    "oldur"            -0.14
    "ltra"             -0.14
    Positive Logits
    Token              Logit
    "akov"             0.18
    "ka"               0.17
    " Jones"           0.14
    "hardt"            0.14
    ".,"               0.14
    " Champagne"       0.14
    "寸"               0.14
    " sol"             0.14
    "orf"              0.14
    "olt"              0.14
    Activation Density: 0.183%

    No Known Activations