INDEX
Negative Logits
gays        -0.07
jokes       -0.07
Boston      -0.07
Coder       -0.07
Ny          -0.06
scooter     -0.06
hos         -0.06
smoke       -0.06
ignorance   -0.06
/access     -0.06
Positive Logits
ี้          0.06
(hero       0.06
kitten      0.06
Lightning   0.06
.elements   0.06
/mit        0.06
Puppy       0.06
.password   0.06
partial     0.06
/thread     0.06
Activations Density: 0.010%