INDEX
Explanations
prepositions followed by locations or actions
occurrences of the word "in."
Negative Logits
onym        -0.80
WER         -0.68
publishes   -0.67
%%          -0.66
arnaev      -0.63
authors     -0.62
lett        -0.61
killed      -0.61
dor         -0.61
eteria      -0.60
Positive Logits
favor       1.14
jury        1.11
lieu        1.03
favour      1.02
juries      1.00
ordinate    0.96
unison      0.95
tandem      0.95
animate     0.95
offensive   0.93
Activations Density 0.227%