INDEX
Explanations
Key phrases and patterns that signal opinions or statements about political and social issues
Negative Logits
øy  -0.17
939  -0.15
ãĥŃãĥ¼  -0.15
-LAST  -0.15
orthand  -0.15
ezier  -0.15
ĵ°  -0.15
QN  -0.15
acman  -0.14
ÑĩаÑĤ  -0.14
Positive Logits
COVID  0.16
actually  0.15
(blank)  0.15
[  0.14
↵  0.14
fo  0.14
indeed  0.14
Flush  0.14
s  0.13
nt  0.13
Activations Density 0.001%