INDEX
Explanations
proper names of individuals
mentions of specific names, particularly "Sharon" and "Schwartz."
Negative Logits
Token   Logit
ttle    -0.89
ntil    -0.85
cci     -0.75
¥       -0.75
¤       -0.74
ways    -0.73
teen    -0.73
ures    -0.71
lass    -0.71
ħ       -0.71
Positive Logits
Token               Logit
MENT                0.70
burg                0.68
MENTS               0.66
Metatron            0.63
guiActiveUnfocused  0.61
Ezek                0.61
rabbi               0.61
mallow              0.60
IUM                 0.59
absor               0.59
Activations Density: 0.033%