INDEX
Explanations
indicators of educational resources and discussions about historical events, particularly regarding race and stereotypes
Negative Logits
ideia             -0.55
orance            -0.54
zeitung           -0.54
ytale             -0.54
enfans            -0.54
henswürdigkeiten  -0.53
Zustimmung        -0.52
xrange            -0.52
ILayout           -0.52
ARANCE            -0.51
Positive Logits
•      2.08
•      1.98
.•     1.67
••     1.66
)•     1.61
•••    1.46
••••   1.41
••     1.38
€¢     1.30
~•     1.23
Activations Density 0.148%