INDEX
Explanations
phrases related to critical commentary or negative viewpoints
expressions of sentiment or opinion, especially negative emotions related to events or situations
Negative Logits
?).         -0.85
.).         -0.74
etheless    -0.74
.*          -0.69
+.          -0.66
.)          -0.66
).          -0.64
arthed      -0.61
odox        -0.61
arist       -0.60
Positive Logits
,"          1.13
%"          1.07
[           1.07
.,"         0.92
",          0.92
":          0.91
,'"         0.89
,''         0.85
"]          0.83
"           0.82
Activations Density 0.740%
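
The page does not define how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and used only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, the feature fires on 7 of them.
sample = [0.0] * 993 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.1, 5.0]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.700%
```

Under this reading, a density of 0.740% would mean the feature is active on roughly 7 or 8 of every 1000 tokens sampled.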