Explanations
phrases indicating emotional or physical struggles and conflicts
Negative Logits
793         -0.16
aight       -0.16
alley       -0.15
à¸Ńà¸ļ      -0.14
.managed    -0.14
792         -0.14
taire       -0.14
hower       -0.14
ushman      -0.14
819         -0.14
Positive Logits
atern       0.16
Wo          0.15
URA         0.15
ayout       0.14
ÅŁt         0.14
ula         0.14
military    0.14
?=.*        0.14
@js         0.14
endo        0.14
Activations Density 0.015%