Explanations
mentions of the name "Eliza."
Negative Logits
Token    Logit
Evet     -0.17
Crus     -0.16
utzer    -0.15
unt      -0.15
го       -0.14
ikut     -0.14
apon     -0.14
efe      -0.14
ajes     -0.14
etsk     -0.14
Positive Logits
Token    Logit
odie     0.25
iza      0.22
ise      0.22
ton      0.19
isa      0.19
aine     0.19
uned     0.18
ż        0.18
leanor   0.18
ahi      0.17
Activations Density: 0.011%