Explanations
phrases related to political rhetoric and discourse
symbols or characters that might represent special formatting or non-standard text elements
Negative Logits
shroud         -0.81
dispers        -0.78
sled           -0.76
assemb         -0.76
recogn         -0.72
nude           -0.71
conservancy    -0.71
closest        -0.70
shack          -0.69
distribut      -0.68
Positive Logits
¬    1.60
ľ    1.59
¡    1.59
«    1.48
ª    1.47
Ļ    1.44
ł    1.42
ĸ    1.41
Ł    1.41
µ    1.39
Activations Density: 0.144%