INDEX
Explanations
the word "mean" used in various contexts
statements that introduce exceptions or clarifications
Negative Logits
Token       Logit
utenberg    -0.78
isin        -0.78
antha       -0.73
tex         -0.71
pes         -0.71
Peb         -0.68
bj          -0.68
figured     -0.66
bah         -0.66
figure      -0.64
Positive Logits
Token         Logit
anymore       0.98
anything      0.94
nor           0.86
anyone        0.86
erest         0.83
anybody       0.81
necessarily   0.81
any           0.77
eworthy       0.71
goodbye       0.70
Activations Density: 0.060%