Explanations
expressions of satisfaction or dissatisfaction
expressions of satisfaction or contentment
Negative Logits
GOODMAN     -0.87
teasp       -0.79
ghai        -0.71
akin        -0.70
ãĤ¼ãĤ¦ãĤ¹   -0.69
ourage      -0.69
hops        -0.65
sites       -0.65
fighter     -0.64
ftime       -0.64
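Token strings in a logit list like this one are shown in the tokenizer's vocabulary form, so a non-ASCII token can appear as a sequence such as `ãĤ¼ãĤ¦ãĤ¹`. The sketch below shows one way to turn such a string back into readable text. It assumes the model uses a GPT-2-style byte-level BPE vocabulary, which the page does not state; the helper names `gpt2_bytes_to_unicode` and `decode_token` are hypothetical and chosen for illustration only.

```python
def gpt2_bytes_to_unicode():
    """Byte -> printable character table, as used by GPT-2-style byte-level BPE.

    Assumption: the dashboard's model uses this scheme; the page does not say so.
    """
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    # Every remaining byte is shifted to an unused code point starting at 256.
    offset = 0
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + offset)
            offset += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


def decode_token(token):
    """Map a vocabulary-form token string back to UTF-8 text."""
    char_to_byte = {ch: b for b, ch in gpt2_bytes_to_unicode().items()}
    raw = bytes(char_to_byte[ch] for ch in token)
    # errors="replace" because a single token may hold a partial UTF-8 sequence.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("ãĤ¼ãĤ¦ãĤ¹"))
    print(decode_token("regards"))
```

Under that assumption, plain ASCII tokens such as `regards` pass through unchanged, and only tokens containing multi-byte characters or leading-space markers look different after decoding.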
Positive Logits
regards     1.19
standing    1.11
regard      1.00
impunity    0.93
respect     0.89
stood       0.86
dignity     0.80
drawn       0.78
draw        0.75
what        0.74
Activations Density 0.061%