INDEX
    Explanations

    proper nouns, particularly names of people

    names ending in certain characters

    Negative Logits
     læng    -0.53
    󠁮    -0.52
     General    -0.50
    󠁬    -0.47
     køb    -0.47
    <caption>    -0.45
     출    -0.45
     BorderSide    -0.44
     stør    -0.43
     offent    -0.42
    Positive Logits
     rhestr    0.54
     protoimpl    0.45
     (@    0.44
    @    0.43
    Jeografia    0.43
    Šaltiniai    0.43
    حياتها    0.43
     dAtA    0.41
    yt    0.41
     PeEnEo    0.41
    Activation Density: 0.071%

    No Known Activations