Explanations
references to societal issues and the impact of leadership
Negative Logits
æļ          -0.15
iese        -0.14
INCLUDING   -0.13
ayrıca      -0.13
是什么      -0.13
包括        -0.13
iyon        -0.13
gồm         -0.13
aled        -0.13
orthand     -0.13
Positive Logits
например    0.33
etc         0.33
مثلا        0.29
example     0.29
exemplo     0.24
наприклад   0.23
etc         0.23
exemple     0.23
example     0.22
など        0.22
Activations Density 0.576%