INDEX
Explanations
numbers with percentage values
occurrences of the article "a" in various contexts
Negative Logits
Token       Logit
agents      -0.70
CHAPTER     -0.67
extrad      -0.65
Presents    -0.64
ovie        -0.64
things      -0.63
Edit        -0.62
dishes      -0.62
authent     -0.61
outlaw      -0.61
Positive Logits
Token       Logit
whopping    1.45
ratio       1.16
verages     1.05
staggering  1.04
median      1.02
negligible  0.96
fraction    0.96
slight      0.94
dismal      0.92
margin      0.89
Activations Density: 0.155%