Explanations
references to societal perceptions and critiques of leadership and authority
Negative Logits
Token           Logit
Nom             -0.17
Hab             -0.16
廊              -0.14
SystemService   -0.14
Ả               -0.14
마음            -0.14
รม              -0.14
Nom             -0.13
ismet           -0.13
nick            -0.13
Positive Logits
Token           Logit
importance      0.20
sworth          0.19
significance    0.18
/tab            0.17
import          0.17
Importance      0.16
مقد             0.16
special         0.16
اهمیت           0.15
重要            0.15
Activations Density 0.204%
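
The density figure above can be read as the share of token positions on which the feature fires at all. The sketch below illustrates that reading under stated assumptions: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical, and the dashboard's exact definition is not given in the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions with a nonzero
    activation. The source does not state the definition it uses.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1,000 token positions, 2 of which activate the feature.
sample = [0.0] * 998 + [1.3, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.200%
```

Under this reading, a density of 0.204% would mean the feature is active on roughly 2 of every 1,000 sampled tokens.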