INDEX
Explanations
announcements and updates about events or projects
Negative Logits
Token            Logit
disputed         -0.75
discredited      -0.69
repud            -0.67
eroded           -0.67
discrimination   -0.66
SPONSORED        -0.66
Prosecutors      -0.63
dispro           -0.63
rejected         -0.62
coercive         -0.62
Positive Logits
Token            Logit
THANK            1.07
Enjoy            0.91
PLEASE           0.90
thank            0.89
Patreon          0.88
Tune             0.88
Alright          0.87
congr            0.87
🙂               0.85
please           0.84
Activations Density 0.611%
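
The density figure above can be read as the share of token positions on which the feature fires. The following is a minimal sketch of that arithmetic, assuming density means "fraction of tokens with activation above zero"; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" counts a token as active when its activation is
    strictly greater than `threshold` (zero by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: one feature over 1000 tokens, firing on 6 of them.
sample = [0.0] * 994 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.1]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.600%"
```

Under this reading, a density of 0.611% would correspond to roughly 6 active tokens per 1000 sampled.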