Explanations
references to economic disparities or financial burdens
Negative Logits
Token     Logit
èĬĿ       -0.16
opa       -0.15
ettel     -0.15
mgr       -0.14
ØŃÙĩ      -0.14
askell    -0.14
Islam     -0.14
usi       -0.14
alytics   -0.13
viar      -0.13
Positive Logits
Token     Logit
Aviv      0.23
ynet      0.22
×         0.22
×ij       0.22
Israeli   0.21
ש         0.21
×ķ×       0.21
Netz      0.20
×Ļ×       0.20
Israel    0.20
Activations Density: 0.314%