INDEX
Explanations
references to religious laws and teachings associated with Judaism and Islam
Negative Logits
bore     -0.17
ahlen    -0.14
indie    -0.13
æ¿       -0.13
ito      -0.13
bv       -0.13
ware     -0.13
erve     -0.13
ge       -0.13
Feb      -0.13
Positive Logits
edla     0.16
ivec     0.15
ież      0.15
cheng    0.14
ious     0.14
onse     0.14
aset     0.14
irut     0.14
laws     0.14
елик     0.14
Activations Density 0.139%