INDEX
Explanations
words relating to spatial locations or positions
prepositions and conjunctions indicating relationships or positional contexts
Negative Logits
answ          -0.72
referen       -0.64
apologizing   -0.64
furthermore   -0.63
OUP           -0.63
warn          -0.62
moreover      -0.61
lineback      -0.61
ritical       -0.61
redibly       -0.61
Positive Logits
aband         0.79
urned         0.67
edge          0.64
Pop           0.64
hops          0.63
pole          0.62
igate         0.62
weddings      0.62
ses           0.61
iped          0.60
Activations Density: 0.528%