    Explanations

    punctuation marks and emoticons used in informal communication

    Negative Logits
    NUMX         -1.09
    )";          -0.92
    />";         -0.91
    "],          -0.88
    `,           -0.87
    '],          -0.87
    ]";          -0.87
     المعيارى    -0.86
    )"),         -0.85
    ]")          -0.84
    Positive Logits
     Sorry       0.55
    lol          0.51
     :           0.49
    haha         0.49
    G            0.48
    ah           0.47
    gh           0.47
     sorry       0.46
    Sorry        0.46
    W            0.46
    Activation Density: 0.206%

    No Known Activations