INDEX
Negative Logits
ruining          0.40
慳               0.39
xb               0.38
causando         0.38
🧊               0.38
spoil            0.38
authorise        0.38
ruined           0.38
MERCHANTABILITY  0.37
collect          0.37

Positive Logits
avor             0.43
地震             0.41
handmade         0.40
ripción          0.40
ному             0.39
regional         0.38
front            0.38
computers        0.38
terrorism        0.38
hare             0.38
Activations Density 0.000%