INDEX
Explanations
proper nouns or names
occurrences of the word "the."
Negative Logits
ambo         -0.93
abul         -0.75
tions        -0.74
ndra         -0.72
gb           -0.71
Operation    -0.69
aza          -0.67
ioned        -0.67
Grab         -0.66
Fore         -0.66
Positive Logits
brunt        1.19
plunge       1.04
reins        1.01
initiative   0.98
blame        0.94
guise        0.89
same         0.88
cues         0.86
liberty      0.85
heaviest     0.85
Activations Density 0.056%