INDEX
Explanations
references to historical events, particularly relating to population movements and socio-political contexts
Negative Logits
¤ëĭ¤      -0.17
broken    -0.16
yaptıģı   -0.16
flown     -0.15
ridden    -0.15
ovna      -0.15
is        -0.15
ается     -0.15
spoken    -0.14
دارد      -0.14
Positive Logits
were      0.35
ieron     0.34
aron      0.32
Were      0.31
were      0.30
کردند     0.30
Were      0.30
стали     0.30
fueron    0.30
weren     0.28
Activations Density 0.035%