INDEX
Explanations
politically sensitive phrases and discussions surrounding social issues
Negative Logits
Token           Logit
ople            -0.63
Tsukuyomi       -0.62
Tonight         -0.61
Interstitial    -0.61
"<              -0.61
ILCS            -0.61
Conquer         -0.60
ructose         -0.59
[_              -0.59
ItemLevel       -0.59
Positive Logits
Token           Logit
consideration   0.83
ilyn            0.82
factor          0.82
besides         0.81
noteworthy      0.79
relates         0.78
worldly         0.75
involves        0.75
orthy           0.71
mentioned       0.71
Activations Density: 0.156%