INDEX
Explanations
prepositions followed by nouns or verb phrases indicating conflict or opposition
prepositions and phrases indicating opposition or conditions
Negative Logits
mask          -0.73
âĨij          -0.65
liner         -0.65
fortunately   -0.60
LV            -0.59
MU            -0.58
KNOWN         -0.58
vill          -0.58
hent          -0.58
CLS           -0.58
Positive Logits
terms         0.63
behalf        0.61
regard        0.59
assian        0.59
tein          0.57
bidding       0.56
transact      0.55
bids          0.55
ertodd        0.54
ussian        0.54
Activations Density 0.586%