    Explanations

    emojis and special characters

    Negative Logits
    Token            Logit
    " evidence"      -0.63
    "ō"              -0.62
    " wages"         -0.59
    "outh"           -0.57
    " substance"     -0.56
    "‑"              -0.56
    "ousing"         -0.56
    " acute"         -0.56
    " Eight"         -0.56
    " advanced"      -0.55
    Positive Logits
    Token            Logit
    " ;)"            3.39
    " :)"            3.37
    " 🙂"            3.31
    " :-)"           3.16
    " ðŁĺ"           3.01
    " haha"          2.34
    " :("            2.17
    " lol"           1.91
    " XD"            1.71
    " LOL"           1.67
    Activation Density: 0.025%

    No Known Activations
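
    Byte-level BPE vocabularies store non-ASCII tokens as remapped byte characters, so an emoji token can show up in a logit list as a string like `ðŁĻĤ`. The sketch below shows one way to turn such strings back into text. It assumes the GPT-2-style byte-to-character table, which this page does not confirm; the helper names `bytes_to_unicode`, `token_bytes`, and `decode_token` are illustrative and not part of any dashboard API.

```python
def bytes_to_unicode():
    """Byte -> printable character table (assumed GPT-2-style byte-level BPE)."""
    # Bytes that are already printable map to themselves.
    keep = set(range(ord("!"), ord("~") + 1))
    keep |= set(range(0xA1, 0xAC + 1))
    keep |= set(range(0xAE, 0xFF + 1))
    table = {}
    shift = 0
    for b in range(256):
        if b in keep:
            table[b] = chr(b)
        else:
            # Remaining bytes are assigned code points 256, 257, ... in order.
            table[b] = chr(256 + shift)
            shift += 1
    return table


UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_bytes(token):
    """Recover the raw bytes behind a vocabulary string."""
    out = bytearray()
    for ch in token:
        if ch in UNICODE_TO_BYTE:
            out.append(UNICODE_TO_BYTE[ch])
        else:
            # e.g. a literal space that a UI already substituted for the marker
            out.extend(ch.encode("utf-8"))
    return bytes(out)


def decode_token(token):
    """Decode to text; incomplete UTF-8 sequences become U+FFFD."""
    return token_bytes(token).decode("utf-8", errors="replace")


if __name__ == "__main__":
    for raw in ["ðŁĻĤ", "âĢij", "Åį", "ðŁĺ"]:
        print(repr(raw), token_bytes(raw).hex(), repr(decode_token(raw)))
```

    A token such as `ðŁĺ` covers only the first three bytes of a four-byte UTF-8 sequence, so it has no standalone text form; it is a prefix that a following token completes.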