Explanations
phrases related to political news and activities
Negative Logits
Token     Logit
.<        -0.69
.:        -0.69
!!!!!     -0.63
.",       -0.62
%.        -0.60
"!        -0.60
.(        -0.59
.,"       -0.58
unless    -0.57
."        -0.57
Positive Logits
Token     Logit
*)        0.96
})        0.90
)}        0.87
)]        0.86
?)        0.82
)]        0.81
?)        0.81
)\        0.75
?),       0.75
+)        0.73
Activations Density: 6.479%