INDEX
Explanations
references to the name "Philip" or variations of it in different contexts
Negative Logits
ertura    -0.17
vette     -0.16
alom      -0.16
"":       -0.16
441       -0.15
iliar     -0.15
mittel    -0.15
ạc        -0.14
uteur     -0.14
itten     -0.14
Positive Logits
pe        0.28
ipp       0.21
ppe       0.20
pon       0.19
ps        0.19
ippi      0.18
pos       0.17
pei       0.16
inct      0.16
ines      0.15
Activations Density: 0.007%