INDEX
    Explanations

    informal expressions of humor or sarcasm

    Negative Logits
    "ьаж"                 -0.57
    " iVar"               -0.56
    " 发表于"              -0.54
    "ГЛА"                 -0.52
    "μερα"                -0.51
    "MessageTagHelper"    -0.48
    " محفوظة"             -0.48
    "FUCK"                -0.47
    "kově"                -0.47
    "ignez"               -0.47
    Positive Logits
    " lol"       1.72
    " LOL"       1.69
    " haha"      1.65
    " hahaha"    1.52
    " ;-)"       1.51
    " Lol"       1.49
    " 🤣"        1.48
    "LOL"        1.48
    " hehe"      1.45
    " 😂"        1.45
    Activation Density: 0.362%

    No Known Activations