INDEX
Negative Logits
조사             -0.09
Investig         -0.09
التهاب           -0.09
Investigation    -0.08
기사             -0.08
Ing              -0.08
Advertisement    -0.08
inflammation     -0.08
Sponsored        -0.08
inations         -0.07
Positive Logits
diagonal         0.11
Diagonal         0.11
diag             0.09
diag             0.09
स्व              0.09
बराब             0.09
identity         0.09
swaps            0.09
espejo           0.08
equil            0.08
Activation Density: 0.033%