Explanations
references to political and social issues
statements regarding societal consequences and governance issues
Negative Logits
Token        Logit
ItemTracker  -0.72
batch        -0.71
PF           -0.70
Sample       -0.69
CV           -0.69
arthed       -0.68
earable      -0.67
ãĤ´ãĥ³       -0.65
OVER         -0.64
recent       -0.63
Positive Logits
Token        Logit
â̦"          0.98
patri        0.96
.'"          0.94
.""          0.94
heid         0.90
democracy    0.87
."[          0.87
..."         0.85
democracies  0.85
,'"          0.84
Activations Density: 1.500%