INDEX
Explanations
occurrences of the word "first" and related phrases
Negative Logits
ixa      -0.18
ixo      -0.17
further  -0.16
dal      -0.16
forth    -0.16
odesk    -0.15
plib     -0.14
pery     -0.14
afort    -0.14
rrha     -0.14
Positive Logits
-ever       0.37
s           0.35
born        0.30
tiên        0.30
-hand       0.29
-rate       0.28
timers      0.26
responders  0.25
-order      0.25
-degree     0.24
Activations Density: 0.127%