INDEX
Explanations
instances of numerical references and comparisons
Negative Logits
Token          Logit
lance          -0.69
Refer          -0.68
JO             -0.67
AAF            -0.67
MSN            -0.65
Signs          -0.64
TOUR           -0.64
conservancy    -0.63
PID            -0.63
Pione          -0.61
Positive Logits
Token          Logit
ths            1.12
teenth         1.10
ieth           0.96
enged          0.87
eenth          0.79
tarian         0.75
ighty          0.75
teen           0.74
undred         0.74
punch          0.73
Activations Density: 0.024%
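
As a rough illustration of what a density figure like this can mean, the sketch below computes the fraction of token positions on which a feature has a non-zero activation. This definition is an assumption (the page does not say how its density is calculated), and the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumed definition: density = (# active positions) / (# positions).
    The dashboard's actual formula is not stated in the source.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 2 of which activate the feature.
sample = [0.0] * 9998 + [1.7, 0.4]
density = activation_density(sample)

# Format as a percentage with three decimals, matching the style used above.
print(f"{density:.3%}")
```

Under this reading, a density of 0.024% would correspond to the feature firing on roughly 24 of every 100,000 token positions.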