INDEX
NEGATIVE LOGITS
oust        -0.87
uty         -0.77
è¦ļéĨĴ      -0.75
Versions    -0.75
itles       -0.75
uid         -0.74
KA          -0.74
olphin      -0.73
inion       -0.72
arest       -0.71
POSITIVE LOGITS
lier        0.97
something   0.93
crap        0.93
somebody    0.86
a           0.86
someone     0.86
an          0.85
fun         0.83
paradise    0.81
cheating    0.79
Activations Density 0.057%