INDEX
Explanations
sequences of numbers and characters formatted like status codes or identifiers
numerical status updates or identifiers related to social media posts
Negative Logits
nomine        -0.84
DRAG          -0.64
¨             -0.63
masc          -0.62
Belt          -0.61
NRA           -0.60
successors    -0.58
chants        -0.58
Vest          -0.58
lé            -0.56
Positive Logits
806           1.33
389           1.32
985           1.32
768           1.31
679           1.30
269           1.30
406           1.30
485           1.29
264           1.29
405           1.29
Activations Density: 0.047%