INDEX
Explanations
instances of the word "on."
NEGATIVE LOGITS
on           -0.18
owl          -0.17
oretical     -0.15
unami        -0.14
disposing    -0.14
quarters     -0.14
eki          -0.14
noon         -0.14
OnInit       -0.14
arness       -0.14
POSITIVE LOGITS
behalf       0.50
/off         0.30
shore        0.29
-site        0.27
occasion     0.27
-line        0.25
eway         0.24
/by          0.23
etime        0.23
-demand      0.22
Activations Density 0.507%