INDEX
Explanations
words related to direct quotes by someone
references to characters, specifically male and female pronouns
Negative Logits
Hub              -0.65
Front            -0.65
Berk             -0.64
internationally  -0.64
HG               -0.63
Royale           -0.62
Argon            -0.62
nationally       -0.60
Batt             -0.60
Azerb            -0.59
Positive Logits
uristic          0.98
zbollah          0.90
eded             0.88
ngth             0.88
didnt            0.88
mos              0.85
lder             0.85
aped             0.85
gemony           0.85
areth            0.84
Activations Density: 0.057%