INDEX
Explanations
references to the continent of Africa
references to African people, culture, or topics related specifically to Africa
Negative Logits
Token     Logit
ORED      -0.78
IDER      -0.72
Perkins   -0.71
OPER      -0.71
INESS     -0.70
代        -0.70
ãģ®éŃĶ    -0.70
MFT       -0.68
士        -0.68
UTION     -0.67
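The entry ãģ®éŃĶ in the list above looks like a raw byte-level BPE vocabulary string rather than readable text. The following is a minimal sketch for recovering the underlying characters, assuming the model uses the GPT-2 style byte-to-unicode mapping (the source does not name the tokenizer, so this is an assumption). The helper names `byte_to_unicode_table` and `decode_token` are illustrative and not part of any dashboard API.

```python
def byte_to_unicode_table():
    """Rebuild the GPT-2 style byte -> printable-character mapping.

    Assumption: the dashboard's tokenizer follows this convention.
    """
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    printable_set = set(printable)
    mapping = {b: chr(b) for b in printable}
    # The remaining bytes are shifted to code points starting at 256.
    n = 0
    for b in range(256):
        if b not in printable_set:
            mapping[b] = chr(256 + n)
            n += 1
    return mapping


def decode_token(token):
    """Turn a byte-level vocabulary string back into readable UTF-8 text."""
    inverse = {ch: b for b, ch in byte_to_unicode_table().items()}
    raw = bytes(inverse[ch] for ch in token)
    # Tokens can end mid-character, so replace undecodable bytes.
    return raw.decode("utf-8", errors="replace")


# The garbled-looking entry, written with escapes to avoid encoding issues.
print(decode_token("\u00e3\u0123\u00ae\u00e9\u0143\u0136"))  # の魔
print(decode_token("UTION"))  # UTION
```

Under that assumption the entry decodes to the two characters の魔, while plain ASCII tokens such as UTION pass through unchanged.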
Positive Logits
Token     Logit
raid      0.94
eatures   0.92
wana      0.92
onso      0.90
rika      0.87
ghan      0.82
rica      0.81
rican     0.80
aren      0.79
riend     0.79
Activations Density: 0.003%