Explanations
instances of numerical indicators or references to figures
Negative Logits
cline
-0.83
giro
-0.74
)}}{
-0.74
alz
-0.72
Bort
-0.72
Rooney
-0.71
بال
-0.70
Picchu
-0.69
ctc
-0.69
Él
-0.68
Positive Logits
Jeffries
0.84
horabuena
0.77
yā
0.76
itize
0.75
0.73
Dummies
0.72
begingroup
0.71
paramString
0.71
Sinha
0.71
Fournier
0.71
Activations Density 0.011%