INDEX
NEGATIVE LOGITS
citizens        -0.61
citizens        -0.57
Citizens        -0.52
UnsafeEnabled   -0.52
Citizens        -0.50
TestBed         -0.50
citizen         -0.49
ciudadanía      -0.49
Citizen         -0.48
Bourgoin        -0.48
POSITIVE LOGITS
незавершена     0.63
lèvres          0.63
TagMode         0.60
متعلقه          0.57
躇              0.56
انيف            0.55
ages            0.55
andidaten       0.54
edi             0.53
etsk            0.53
Activations Density: 0.046%