Explanations
phrases related to political controversy and extremism
Negative Logits
mosqu          -0.86
satell         -0.83
extinguished   -0.82
ikuman         -0.81
glim           -0.80
defe           -0.78
ishable        -0.76
steady         -0.76
transition     -0.76
gobl           -0.74
Positive Logits
Additionally    1.42
Apparently      1.40
Furthermore     1.40
Specifically    1.40
According       1.40
Interestingly   1.33
Needless        1.31
Ironically      1.31
Moreover        1.30
Meanwhile       1.30
Activations Density: 0.768%