INDEX
Negative Logits
Tweet        -0.08
Ana          -0.07
Thr          -0.07
撮           -0.07
ат           -0.07
Performed    -0.07
�            -0.07
خب           -0.06
walk         -0.06
sp           -0.06
Positive Logits
Phys         0.07
nze          0.07
leanup       0.06
myself       0.06
game         0.06
down         0.06
stopped      0.06
themselves   0.06
influencers  0.06
님           0.06
Activation Density: 0.001%