INDEX
Explanations
political tension intrigue corruption
Negative Logits
0  1.65
Atlet  1.56
}$.  1.55
து  1.50
ного  1.46
ğı  1.45
ний  1.41
}{  1.38
1  1.38
}$,  1.38
Positive Logits
étrang  1.65
iname  1.51
靂  1.49
দিনই  1.48
Secara  1.45
ي  1.43
ли  1.41
ettiin  1.39
ynomial  1.38
నిక  1.38
Activations Density 0.024%