Explanations
phrases related to controversial topics or actions
key phrases related to discoveries and significant events
Negative Logits
Token             Logit
!.                -0.77
KNOWN             -0.65
URA               -0.64
scrut             -0.63
};                -0.63
+.                -0.62
ANI               -0.62
!,                -0.60
TPPStreamerBot    -0.57
';                -0.57
Positive Logits
Token             Logit
lacks             0.74
shouldn           0.72
isn               0.66
violates          0.64
outweigh          0.64
should            0.63
proves            0.63
represents        0.63
undermines        0.63
constitutes       0.62
Activations Density: 1.039%
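The density figure above is commonly read as the share of token positions on which the feature fires at all. The Python sketch below illustrates that reading only. The function name `activation_density`, the zero threshold, and the sample activations are assumptions made for illustration, not values or code taken from this dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens with a nonzero activation.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical per-token activations, invented for illustration only.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.2, 0.0, 0.0, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 1.039% would mean the feature is active on roughly one token in a hundred of the sampled text.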