Explanations
references to prayers and religious rituals
Negative Logits
aji        -0.22
sez        -0.17
aci        -0.16
udies      -0.15
_controls  -0.15
opsis      -0.15
ëķ         -0.15
ulus       -0.14
ɵ          -0.14
.dsl       -0.14
Positive Logits
×          0.24
×ķ         0.23
Netz       0.21
Ö          0.21
×ij        0.20
chem       0.20
×IJ        0.20
×          0.19
×ŀ         0.18
ש          0.18
Activations Density 0.023%