INDEX
Explanations
negations or refutations of claims
negations related to various topics or statements
Negative Logits
ãĤ¼ãĤ¦ãĤ¹    -0.72
eur    -0.69
çīĪ    -0.69
stakes    -0.69
itech    -0.66
¿½    -0.64
Finder    -0.64
Org    -0.63
å¥    -0.63
Films    -0.62
Positive Logits
adequately    1.17
necessarily    1.17
icably    1.09
sufficiently    0.97
properly    0.94
epad    0.94
icable    0.94
hin    0.90
assimil    0.89
bother    0.89
Activations Density: 0.228%