Explanations
expressions of critical observations or reactions to societal issues
Negative Logits
Token           Logit
accordingly     -0.61
respectively    -0.59
atalytic        -0.54
jedenfalls      -0.53
kör             -0.51
eraard          -0.51
wnież           -0.50
oner            -0.50
XmlAttribute    -0.49
demais          -0.49
Positive Logits
Token           Logit
such            2.55
such            2.13
Such            2.00
SUCH            1.93
Such            1.87
solch           1.82
如此            1.76
столь           1.74
這麼            1.68
so              1.64
Activations Density 0.703%