NEGATIVE LOGITS
æng    0.57
ising    0.57
ienze    0.56
otho    0.55
什么的    0.54
ৱ    0.53
prenez    0.52
sahaja    0.51
ياته    0.50
espécies    0.50
POSITIVE LOGITS
clinically    0.73
阉    0.71
antisemit    0.67
MODEL    0.66
clinical    0.65
gynec    0.64
клини    0.63
تنظيم    0.62
clin    0.59
पार्टी    0.59
Activations Density 0.001%