    Explanations

    personal traits or characteristics of individuals, especially related to appearance or behavior

    elements related to complex character traits and emotional experiences

    Negative Logits
    Token          Logit
    "Sav"          -0.64
    "yss"          -0.60
    "idth"         -0.60
    " stewards"    -0.58
    "imar"         -0.57
    "ariat"        -0.54
    " Panama"      -0.54
    "Americ"       -0.53
    "olon"         -0.53
    "odore"        -0.53

    Positive Logits
    Token          Logit
    "*."           1.16
    "!."           1.00
    ".*"           0.95
    ".("           0.91
    " ;)"          0.91
    ".["           0.89
    " 🙂"          0.89
    "+."           0.88
    " haha"        0.88
    " thanks"      0.88

    Act Density 0.879%

    No Known Activations