INDEX
    Explanations

    Expressions of gratitude and appreciation

    Hashtags and social media symbols

    Exclamations on social media

    Negative Logits
    ' –,'          -0.89
    '*/;'          -0.80
    ' —,'          -0.80
    ',–'           -0.79
    '.",\n'        -0.78
    '—,'           -0.77
    '")));\n'      -0.76
    '. '           -0.75
    '<?\n'         -0.74
    '</caption>'   -0.73
    Positive Logits
    ' #'           1.20
    ' @'           1.13
    '#'            0.92
    '@'            0.69
    ' &'           0.64
    ' \#'          0.60
    'Repost'       0.59
    ' pic'         0.59
    '...@'         0.59
    ' (@'          0.58
    Activation Density: 0.075%

    No Known Activations