INDEX
    Explanations

    expressions that indicate contradiction or contrasting statements

    Negative Logits
    Token             Logit
    " يتيمه"          -0.56
    "cektir"          -0.54
    " lenker"         -0.53
    " Theſe"          -0.52
    "xFFFFFFFF"       -0.51
    "GHIJKLM"         -0.51
    "أما"             -0.50
    " mukana"         -0.50
    "первых"          -0.50
    " Vordergrund"    -0.47
    Positive Logits
    Token             Logit
    " yet"            1.05
    "Yet"             0.93
    "yet"             0.90
    " Trotzdem"       0.87
    " pourtant"       0.87
    " Yet"            0.86
    " despite"        0.80
    "Trotz"           0.79
    " YET"            0.79
    " nevertheless"   0.79
    Activation Density: 0.286%

    No Known Activations