    Explanations

    references to nationalities or ethnic identities

    Negative Logits
    ".getRaw"      -0.16
    "FRING"        -0.16
    "antha"        -0.16
    "enschaft"     -0.15
    ".ease"        -0.15
    "arium"        -0.14
    "wort"         -0.14
    "poon"         -0.14
    "Ĺi"           -0.14
    "Ø´ÙħاÙĦÛĮ"     -0.14
    Positive Logits
    "RP"           0.16
    " surrounds"   0.15
    "enet"         0.15
    " dr"          0.15
    "fp"           0.15
    "ibold"        0.14
    " obstruct"    0.14
    "ROTO"         0.14
    " combust"     0.14
    "FP"           0.14
    Activation Density: 0.122%

    No Known Activations