INDEX
Negative Logits
dx       0.38
nosis    0.38
ytic     0.37
drig     0.36
禄       0.35
razio    0.35
muda     0.34
ellular  0.34
rischi   0.34
نکن      0.34
Positive Logits
Independent   0.64
Independent   0.58
Independ      0.57
independ      0.57
independent   0.55
independent   0.55
Independence  0.53
independence  0.52
independents  0.51
independence  0.49
Activations Density 0.003%