INDEX
Explanations
expressions of personal experience and emotional reflection
Negative Logits
áp      -0.08
itself  -0.08
bilt    -0.07
ousand  -0.07
lopen   -0.07
ارس     -0.07
tti     -0.07
lô      -0.07
ilet    -0.07
reck    -0.07
Positive Logits
'm      0.11
’m      0.11
've     0.10
’ve     0.09
'll     0.09
am      0.08
zelf    0.08
’ll     0.08
zzo     0.08
myself  0.07
Activations Density: 0.293%