INDEX
Explanations
topics related to political issues and public policy
Negative Logits
gató      -0.42
desee     -0.40
nemmeno   -0.38
major     -0.37
much      -0.37
suerte    -0.36
Sam       -0.36
fermés    -0.36
ًا        -0.36
(         -0.36
Positive Logits
Мексичка  1.13
)++;      0.96
poffe     0.89
myſelf    0.87
raiſ      0.87
+#+#      0.87
neceff    0.86
houſe     0.85
pleaſure  0.85
ſever     0.82
Activations Density: 1.890%