INDEX
    Explanations

    phrases that emphasize contradiction or irony

    Negative Logits
    "ListItemIcon"  -0.55
    " Ac"           -0.51
    "fiés"          -0.49
    "Santis"        -0.47
    "اعم"           -0.47
    "lis"           -0.46
    " tủ"           -0.45
    "المكان"        -0.44
    "noons"         -0.44
    "ISupport"      -0.43
    Positive Logits
    " words"     1.65
    "words"      1.26
    " Words"     1.23
    "Words"      1.22
    " WORDS"     1.20
    " słowa"     1.12
    " word"      1.10
    " woorden"   1.08
    " palabras"  1.06
    "WORDS"      1.01
    Activation Density: 0.193%

    No Known Activations