INDEX
NEGATIVE LOGITS
Polygon     -0.07
brothers    -0.07
资产        -0.07
p           -0.06
para        -0.06
querying    -0.06
ीएस         -0.06
eldre       -0.06
compat      -0.06
empathy     -0.06
POSITIVE LOGITS
jail        0.16
Jail        0.15
jails       0.11
jailed      0.11
ail         0.09
Bail        0.07
AIL         0.07
jit         0.07
hail        0.07
antanamo    0.07
Activations Density 0.002%