Explanations
terms related to education and public institutions
Negative Logits
Token               Logit
(no token shown)    -0.17
:                   -0.16
,                   -0.16
various             -0.15
ain                 -0.15
Various             -0.15
down                -0.15
fy                  -0.15
.                   -0.15
inform              -0.15
Positive Logits
Token        Logit
之一         0.21
OMIT         0.16
sumer        0.16
nze          0.15
地方         0.14
leftright    0.14
θι           0.14
,'#          0.14
spiel        0.14
>",          0.14
Activations Density 0.100%
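
Two of the positive-logit tokens are Chinese: 之一 and 地方. In a raw byte-level BPE vocabulary they are stored as the strings ä¹ĭä¸Ģ and åľ°æĸ¹. The sketch below shows how the raw form maps to readable text. It assumes the vocabulary uses the GPT-2-style byte-to-unicode table, which the source does not state. The helper names `bytes_to_unicode` and `decode_token` are illustrative and not part of any dashboard API.

```python
def bytes_to_unicode():
    """Build the GPT-2-style map from byte value (0-255) to a printable character.

    Assumption: the vocabulary behind this dashboard uses this table.
    """
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is assigned the next free code point from 256 upward.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


def decode_token(raw):
    """Convert a raw byte-level token string back to UTF-8 text.

    Raises KeyError if `raw` contains a character outside the byte table,
    which usually means the string was already decoded.
    """
    char_to_byte = {ch: b for b, ch in bytes_to_unicode().items()}
    data = bytes(char_to_byte[ch] for ch in raw)
    # A token may end in the middle of a multi-byte character, so replace
    # undecodable bytes instead of failing.
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for raw in ["ä¹ĭä¸Ģ", "åľ°æĸ¹"]:
        print(raw, "->", decode_token(raw))
```

Tokens that are already plain ASCII, such as `OMIT` or `spiel`, pass through this function unchanged.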