INDEX
Explanations
references to people's backgrounds or associations
the article "a" in various contexts
Negative Logits
easy      -0.76
evidence  -0.74
agree     -0.71
encies    -0.70
IMAGES    -0.70
Actions   -0.69
alerts    -0.69
docs      -0.67
views     -0.67
iev       -0.67
Positive Logits
handful   0.93
dozen     0.88
whopping  0.87
mixture   0.83
unts      0.83
lot       0.82
sizeable  0.82
similar   0.81
sizable   0.80
lyss      0.80
Activations Density 0.574%