INDEX
Explanations
words related to personas or individuals, specifically names like "Sophie," "Sophia," or "Sofia."
references to the name "Sophie" and its variations
Negative Logits
FORM       -0.71
eed        -0.65
BAT        -0.65
âĸ¬âĸ¬     -0.62
hips       -0.60
redo       -0.59
forth      -0.59
LESS       -0.58
ARDS       -0.58
MENTS      -0.57
Positive Logits
sylv       0.91
osph       0.90
otos       0.86
aea        0.84
inx        0.81
ongyang    0.79
osure      0.79
sta        0.77
ofer       0.77
phony      0.76
Activations Density 0.038%