INDEX
Explanations
mentions of specific ethnic or religious groups
references to specific marginalized groups and their social contexts
Negative Logits
inventoryQuantity   -0.76
ĸļ                  -0.75
Archdemon           -0.75
Rog                 -0.73
ellipt              -0.73
arium               -0.72
eus                 -0.69
ortex               -0.67
abolic              -0.63
Turing              -0.61
Positive Logits
nationalist         1.02
istani              1.02
nationalists        1.01
communities         0.99
Nadu                0.99
minorities          0.98
pora                0.95
separatist          0.93
Muslims             0.92
separatists         0.89
Activations Density: 0.190%