INDEX
Explanations
phrases indicating direct communication and assistance
Negative Logits
ehler      -0.19
eniz       -0.16
اÙĦات      -0.16
801        -0.15
opoulos    -0.15
sodom      -0.15
ussen      -0.15
Leak       -0.14
.med       -0.14
gae        -0.14
Positive Logits
Modifiers   0.15
brides      0.14
upos        0.14
quis        0.14
lim         0.14
wed         0.14
/functions  0.14
liÄį        0.14
wed         0.13
hon         0.13
Activations Density 0.000%