INDEX
    Explanations

    phrases that express claims, assertions, or attributions

    Negative Logits
    rema        -0.15
    IFO         -0.15
     Asi        -0.15
    位          -0.15
     bankrupt   -0.14
     Ara        -0.14
    MOTE        -0.14
    lev         -0.14
    LEV         -0.14
    inea        -0.14
    Positive Logits
    oho         0.16
    اع          0.16
    eker        0.15
     Wolff      0.15
     Tic        0.15
    umi         0.15
     precious   0.15
    訓          0.14
     Albania    0.14
     Eig        0.14
    Act Density 0.217%

    No Known Activations