INDEX
Explanations
references to ethnic minority groups
minority groups
Negative Logits
Ram      -0.42
launch   -0.41
Appl     -0.40
tourne   -0.39
plan     -0.39
Wel      -0.39
easy     -0.39
quoi     -0.39
Eng      -0.38
राम      -0.38
Positive Logits
minority        2.33
Minority        2.08
minorities      1.87
ority           1.55
少数            1.15
MINOR           0.88
Minder          0.84
Majority        0.79
Majority        0.79
Personendaten   0.75
Activations Density 0.008%
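
To put the density figure in concrete terms, here is a minimal sketch of the arithmetic. It assumes that activation density means the fraction of tokens on which the feature is active, which this page does not state explicitly. The function name and the sample size of 100,000 tokens are hypothetical and chosen only for illustration.

```python
def expected_activations(density_percent: float, n_tokens: int) -> float:
    """Expected number of tokens on which the feature is active.

    Assumption: density_percent is the percentage of tokens with a
    nonzero activation (the page does not define the term).
    """
    return density_percent / 100.0 * n_tokens


# Hypothetical sample of 100,000 tokens at the reported 0.008% density.
count = expected_activations(0.008, 100_000)
print(f"about {count:.0f} active tokens per 100,000")
```

Under that assumption the feature is very sparse, which fits a feature tied to a narrow set of tokens such as "minority" and its variants.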