INDEX
Explanations
discussions about ideological influence in education
Negative Logits
rze          -0.16
.struct      -0.14
cia          -0.14
Owen         -0.13
erg          -0.13
Lightning    -0.13
ÄĻ           -0.13
nard         -0.13
asz          -0.13
lon          -0.13
Positive Logits
instead      0.21
instead      0.18
Instead      0.16
Instead      0.15
ëĿ½          0.15
ideo         0.14
_NC          0.14
serter       0.14
orest        0.14
pio          0.14
Activations Density: 0.212%