INDEX
Explanations
Twitter handles with varying numbers in them
user mentions or social media handles
Negative Logits
shrink        -0.72
breadth       -0.66
captcha       -0.65
shortage      -0.62
NetMessage    -0.60
swelling      -0.60
Gree          -0.59
Barrier       -0.59
pressure      -0.59
Bound         -0.59
Positive Logits
/)            0.94
Jr            0.83
)             0.83
reports       0.82
CTV           0.82
yssey         0.80
enegger       0.78
afort         0.78
hester        0.77
/.            0.77
Activations Density 0.077%