Explanations
phrases related to political discussion and controversial topics
Negative Logits
erves           -0.74
=-=-=-=-        -0.68
rehens          -0.68
nevertheless    -0.68
_-              -0.65
furthermore     -0.64
ÃŃs             -0.64
ATCH            -0.63
nonetheless     -0.62
Ö¼              -0.61
Positive Logits
judicial        0.89
"#              0.88
ordinary        0.85
caliphate       0.82
"               0.80
social          0.80
metadata        0.79
Islamic         0.79
advant          0.73
prime           0.69
Activations Density: 0.023%