INDEX
    Explanations

    negative sentiment expressions

    Negative Logits
     myſelf             -0.89
     pleaſure           -0.86
     חיצוניים           -0.82
     purpoſe            -0.81
     ―――――              -0.79
     ſy                 -0.77
    (token not shown)   -0.77
     diſt               -0.75
     himſelf            -0.74
     raiſ               -0.74
    Positive Logits
    (token not shown)   0.55
    enumi               0.55
    stdc                0.47
    tomat               0.45
    _                   0.45
    __':                0.43
    Utf                 0.41
    ֔                   0.41
     $                  0.41
    elett               0.41
    Activation Density: 0.070%

    No Known Activations