INDEX
Explanations
negations or terms indicating opposition
Negative Logits
Token         Logit
eport         -0.71
ritz          -0.71
uca           -0.69
esome         -0.66
andestine     -0.66
oteric        -0.66
ebin          -0.65
vier          -0.64
NetMessage    -0.64
onna          -0.63
Positive Logits
Token         Logit
represented   0.83
suited        0.69
portrayed     0.68
associated    0.68
aimed         0.67
belonging     0.65
İĭ            0.65
arising       0.64
depicted      0.64
ª             0.64
Activations Density 0.136%