INDEX
Explanations
ratios and numerical relationships within text
textual references to statistical ratios
Negative Logits
gotten    -0.88
inelli    -0.79
dream     -0.79
uggage    -0.71
REAM      -0.68
usions    -0.64
          -0.64
asts      -0.64
bies      -0.63
ORE       -0.63
Positive Logits
xual      1.08
ratios    0.97
ratio     0.97
mismatch  0.85
yip       0.85
numer     0.79
ically    0.75
ical      0.72
sum       0.70
nces      0.69
Activations Density 0.018%