INDEX
    Explanations

    names of individuals or organizations

    word forms indicating names or identifiers of individuals

    Negative Logits
     shorth          -0.71
     subscript       -0.66
    Scotland         -0.60
     demoral         -0.59
    Tokens           -0.57
     plural          -0.56
    Prim             -0.56
    FIX              -0.56
     overwhelming    -0.55
     corrid          -0.55
    Positive Logits
     Jr               1.10
    oulos             1.01
    oglu              1.00
    enegger           0.96
    opoulos           0.92
    owski             0.90
    zyk               0.88
     III              0.88
    ewski             0.86
     Sr               0.86
    Act Density 0.338%

    No Known Activations