INDEX
Negative Logits
will        0.39
Sand        0.39
A           0.38
can         0.36
The         0.35
There       0.35
estrogen    0.35
Pearson     0.34
Salt        0.34
They        0.34
Positive Logits
btw         0.61
😉          0.55
;)          0.55
rasında     0.55
!”,         0.54
!")         0.54
!!”         0.51
😉          0.51
btw         0.50
!”          0.50
Activations Density 0.032%