INDEX
    Explanations

    proper nouns, particularly names of individuals

    Negative Logits
    abra          -0.15
    agini         -0.15
    itizen        -0.15
    uong          -0.14
    _utilities    -0.14
    ë¥            -0.14
    ritch         -0.14
    rame          -0.14
    inki          -0.14
     Metallic     -0.14
    Positive Logits
    ibus          0.16
     shadow       0.14
    eck           0.14
     Arms         0.14
    sville        0.14
    arpa          0.14
    é¦            0.13
    145           0.13
    oz            0.13
    ussy          0.13
    Activation Density: 0.040%

    No Known Activations