INDEX
Explanations
names of people
pronouns and names that indicate people or characters in a narrative
Negative Logits
ourced        -0.70
pregn         -0.64
ources        -0.64
BASE          -0.63
Dragonbound   -0.63
fencing       -0.61
Redd          -0.60
VW            -0.60
ILCS          -0.59
runes         -0.59
Positive Logits
auga          0.82
hower         0.72
arat          0.72
inite         0.68
amines        0.67
ghan          0.65
onen          0.64
nown          0.64
rag           0.63
teenth        0.62
Activations Density: 0.129%