Explanations
specific mentions of the name "Edward"
mentions of the name "Edward."
Negative Logits
nda                -0.71
place              -0.68
snail              -0.63
sensit             -0.62
assian             -0.62
tsky               -0.62
empl               -0.61
kick               -0.60
internationally    -0.59
calling            -0.59
Positive Logits
Edward             1.07
Snowden            1.04
Byrne              0.86
sson               0.85
ilant              0.85
Edward             0.83
Cullen             0.76
Nero               0.75
rey                0.75
Heath              0.74
Activations Density 0.004%