INDEX
Explanations
words related to societal issues and government actions
calls to action and reflections on societal issues
Negative Logits
)|            -0.63
®             -0.63
Below         -0.59
cous          -0.55
apps          -0.53
Below         -0.53
iciary        -0.52
Annotations   -0.52
usage         -0.51
modified      -0.51
Positive Logits
even          0.75
also          0.71
likewise      0.66
EVEN          0.65
goddamn       0.62
Ŀ             0.61
Ͻ             0.61
drown         0.59
Lindsey       0.57
enough        0.55
Activations Density: 0.457%