Explanations
political and social commentary
Negative Logits
rb          -0.78
Redditor    -0.74
RM          -0.74
:(          -0.71
tera        -0.67
":"/        -0.63
itor        -0.62
lear        -0.62
235         -0.62
:[          -0.62
Positive Logits
assorted    1.28
etc         1.27
etc         1.23
others      0.89
whatever    0.89
possibly    0.89
finally     0.88
other       0.86
even        0.84
downright   0.83
Activations Density: 1.347%