INDEX
Explanations
phrases related to political figures and action
references to political figures and their actions
Negative Logits
frontman      -0.60
ãĥĦ           -0.52
surprisingly  -0.51
¶             -0.51
tips          -0.50
ashington     -0.49
®             -0.49
"[            -0.48
]),           -0.47
ielding       -0.47
Positive Logits
..."          1.37
â̦"           1.33
)."           1.26
..."          1.15
.")           1.10
.""           1.10
!"            1.08
.'"           1.07
â̦"           1.07
â̦."          1.02
Activations Density: 1.970%