INDEX
Explanations
references to people or profiles on social media platforms
mentions of social media activities and accounts
Negative Logits
boarding    -0.62
blush       -0.61
mort        -0.60
ply         -0.59
dispers     -0.59
erous       -0.59
slic        -0.57
aging       -0.57
apart       -0.57
outl        -0.56
Positive Logits
<|endoftext|>    0.98
Follow           0.96
Alternatively    0.91
Fans             0.86
Also             0.79
Got              0.79
Find             0.78
Tickets          0.77
HELP             0.76
Support          0.76
Activations Density: 0.065%