INDEX
Explanations
phrases related to protest and conflict
Negative Logits
“
-1.62
“
-1.51
"
-1.39
,“
-1.30
。“
-1.15
—“
-1.14
-“
-1.12
:“
-1.09
{"
-1.08
(“
-1.08
Positive Logits
'
2.23
‘
1.93
『
1.61
-'
1.59
、『
1.57
('
1.56
(‘
1.52
,'
1.50
。『
1.48
='
1.46
Activations Density 4.265%