INDEX
Explanations
occurrences of the word "number" and related numerical terms
Negative Logits
eward     -0.21
born      -0.17
ward      -0.16
utin      -0.16
eniz      -0.16
ัมพ       -0.16
manner    -0.15
wards     -0.15
avr       -0.15
ovy       -0.15
Positive Logits
erable      0.20
icer        0.17
erb         0.17
rients      0.16
.gdx        0.15
/address    0.15
aciones     0.15
exion       0.15
óż          0.15
velle       0.15
Activations Density 0.083%