INDEX
Explanations
sentences expressing a negative viewpoint or disagreement
negations and expressions of inability or lack
Negative Logits
ription    -0.74
lesi       -0.70
plates     -0.69
plate      -0.68
rys        -0.67
pora       -0.67
andan      -0.66
odium      -0.65
istrates   -0.65
alian      -0.64
Positive Logits
darn          0.77
èª            0.74
Õ             0.74
\\\\          0.72
æĦ            0.67
íķ            0.66
ãģł           0.66
âķIJ          0.66
Availability  0.65
ifiable       0.65
Activations Density 0.063%