INDEX
Explanations
themes related to national identity and social issues
Negative Logits
ajaran        -0.16
oeff          -0.15
personally    -0.15
idis          -0.14
persön        -0.14
itto          -0.14
dda           -0.14
acier         -0.14
amacare       -0.14
ahat          -0.14
Positive Logits
itself        0.26
its           0.23
-wide         0.22
imar          0.20
wide          0.19
Its           0.17
Its           0.17
collectively  0.16
conv          0.16
orer          0.15
Activations Density: 0.220%