Explanations
references to various ethnic groups and nationalities
Negative Logits
AssemblyProduct    -0.74
twimg              -0.68
IntoConstraints    -0.63
defStyle           -0.61
препратки          -0.60
GenerationType     -0.60
Бахар              -0.58
دانشنامهٔ          -0.57
Демографія         -0.54
StructEnd          -0.53
Positive Logits
Jewish             0.60
\{\\               0.57
minority           0.52
Muslim             0.51
Gypsy              0.50
Jews               0.50
Hentet             0.49
family             0.49
Jewish             0.48
Mediabestanden     0.48
Activations Density: 0.326%