Explanations
references to Italian culture or elements
Negative Logits
saraba         -0.82
ngược          -0.61
mulut          -0.60
indépendant    -0.60
Demografía     -0.59
cytoplasm      -0.58
الإنجليزية     -0.57
colspan        -0.57
vidrio         -0.57
Cardona        -0.56
Positive Logits
Italian        1.27
Italian        1.12
italian        1.09
Italy          1.06
italian        1.00
Italians       0.98
Italy          0.96
ITAL           0.96
Itali          0.90
италья         0.87
Activations Density 0.088%