Explanations
phrases related to addressing issues, making policy changes, and connecting with people in various contexts
Negative Logits
astical     -0.89
effects     -0.80
robe        -0.75
icity       -0.72
ardless     -0.71
claimed     -0.68
ventions    -0.67
similarly   -0.65
similar     -0.64
Cas         -0.63
Positive Logits
Heller      0.65
Canaver     0.65
Mehran      0.64
nutshell    0.63
Scrib       0.63
Palin       0.62
Hannity     0.61
Leh         0.60
motivating  0.60
Rudd        0.60
Activations Density: 1.153%