INDEX
    Explanations

    phrases related to promises or commitments to end certain practices or conditions

    Negative Logits
    LETTE     -0.15
     Pais     -0.14
     iris     -0.14
    alth      -0.14
    vette     -0.14
    antd      -0.14
    locker    -0.14
    udget     -0.13
    ses       -0.13
    065       -0.13
    Positive Logits
    ervas      0.16
    ear        0.16
    oj         0.16
    หย         0.15
    illard     0.15
    ely        0.14
    pto        0.14
    icon       0.14
     putas     0.14
    ikh        0.13
    Act Density 0.023%

    No Known Activations