INDEX
Explanations
references to various forms of media and their relationship with societal issues
Negative Logits
άνα      -0.15
'&#      -0.14
emen     -0.14
omore    -0.13
ska      -0.13
aan      -0.13
áž       -0.13
ana      -0.13
à¸Ļว     -0.13
obi      -0.13
Positive Logits
than     0.20
besides  0.19
than     0.19
-than    0.18
world    0.17
niż      0.16
equally  0.15
uch      0.15
_than    0.15
bes      0.14
Activations Density: 0.315%