Explanations
references to Asian ethnicities and cultures
Negative Logits
الحياه    -0.49
ske    -0.36
يتيمه    -0.36
erhalten    -0.35
sikker    -0.34
Ghar    -0.34
spør    -0.33
Ske    -0.33
Talla    -0.33
ciler    -0.33
Positive Logits
Asian    0.85
Chinese    0.84
Chinese    0.81
Asians    0.80
Asian    0.80
asian    0.78
chinese    0.78
chinois    0.77
japones    0.74
Taiwan    0.73
Activations Density: 0.669%