Negative Logits
Safety          0.47
Lent            0.42
Netanyahu       0.41
OSHA            0.39
comportamenti   0.39
격              0.38
Lit             0.38
Astrophysics    0.37
Nuclear         0.37
Safety          0.37
Positive Logits
samano          0.39
三分            0.38
ongs            0.37
THREE           0.37
draper          0.37
ÔNG             0.37
antry           0.37
अर्स             0.36
᱓               0.36
drie            0.36
Activations Density 0.000%