INDEX
Explanations
occurrences of the article "a."
Negative Logits
Clarkson        -0.71
forth           -0.71
Angus           -0.71
Jagu            -0.69
Borders         -0.66
Emerson         -0.65
appointments    -0.64
Eag             -0.63
Allied          -0.62
quotas          -0.62
Positive Logits
sexual          0.81
ria             0.78
cess            0.78
lder            0.78
vec             0.78
guest           0.77
][              0.76
ird             0.75
lex             0.75
pert            0.68
Activations Density 0.059%