INDEX
    Explanations

    expressions of recommendation or endorsement

    Negative Logits
    рез
    -0.16
    tero
    -0.16
    lom
    -0.15
    鉄
    -0.14
    akte
    -0.14
    ken
    -0.14
     Basket
    -0.14
    uzzi
    -0.14
    alom
    -0.14
     agreed
    -0.13
    Positive Logits
     kepada
    0.20
     anybody
    0.20
     unto
    0.20
     anyone
    0.19
    atory
    0.15
    แก
    0.14
    íny
    0.14
    ùy
    0.14
     Anyone
    0.14
     anytime
    0.14
    Activation Density 0.047%

    No Known Activations