INDEX
Explanations
Twitter usernames
the closing parentheses in text, indicating a format related to social media interactions or posts
Negative Logits
Token       Logit
icion       -0.79
omorphic    -0.74
othermal    -0.67
ients       -0.66
climbers    -0.66
blers       -0.66
perate      -0.64
indal       -0.63
olars       -0.62
houses      -0.62
Positive Logits
Token       Logit
October     0.88
September   0.88
August      0.82
November    0.81
June        0.79
September   0.79
December    0.78
May         0.78
October     0.78
February    0.77
Activations Density: 0.032%