INDEX
    Explanations

    proper nouns, particularly names of people and places

    Negative Logits
    Token          Logit
    "ADF"          -0.15
    "ÑĢана"        -0.15
    "claimer"      -0.15
    "itin"         -0.15
    " habit"       -0.14
    " redesign"    -0.14
    " kim"         -0.14
    "è¼ķ"          -0.13
    "%A"           -0.13
    " harm"        -0.13
    Positive Logits
    Token          Logit
    "алом"         0.15
    "ubat"         0.15
    "å̼"           0.15
    " VALUE"       0.14
    "_PACK"        0.14
    " Ulus"        0.14
    " naï"         0.14
    ")value"       0.14
    "åĿIJ"         0.14
    "afone"        0.14
    Act Density 0.022%

    No Known Activations