Explanations
expressions related to emotional states and sensations
Negative Logits
../../../    -0.17
itals        -0.15
esse         -0.15
unidad       -0.15
doi          -0.14
esian        -0.14
ophone       -0.14
mun          -0.14
appe         -0.14
ello         -0.14
Positive Logits
-good        0.27
ings         0.24
sorry        0.22
Sorry        0.19
lessly       0.19
sorry        0.18
inspace      0.18
-safe        0.17
INGS         0.17
good         0.17
Activations Density: 0.041%