INDEX
    Explanations

    references to groups or categories of individuals or entities

    Negative Logits
    Token          Logit
    "ple"          -0.06
    "(OP"          -0.06
    " sociální"    -0.06
    "æ§"           -0.06
    "уляр"         -0.06
    "aque"         -0.06
    " Guys"        -0.06
    " Strikes"     -0.06
    "iti"          -0.06
    "zer"          -0.06
    Positive Logits
    Token          Logit
    " alike"       0.17
    " similar"     0.11
    " others"      0.10
    " simil"       0.09
    "others"       0.09
    "similar"      0.08
    "aille"        0.08
    " comparable"  0.08
    " similarly"   0.07
    "odb"          0.07
    Act Density 0.031%
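
The density figure above is usually read as the share of tokens on which the feature fires at all. The page does not say how it was computed, so the following is only a minimal sketch under that assumption; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of tokens with a nonzero
    (above-threshold) activation, expressed as a percentage.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return 100.0 * fired / len(activations)


# Hypothetical data: 10,000 tokens, 3 of which activate the feature.
sample = [0.0] * 9997 + [0.4, 1.2, 0.7]
print(f"{activation_density(sample):.3f}%")
```

A value near 0.03% would mean the feature fires on roughly 3 tokens in every 10,000, which is consistent with a sparse feature.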

    No Known Activations