INDEX
Explanations
information related to political figures, government actions, and policies
Negative Logits
ãĤ´  -0.69
¯  -0.65
aurus  -0.65
=/  -0.64
forcer  -0.61
Ö¼  -0.60
aco  -0.60
ðŁĺ  -0.60
affer  -0.60
ouf  -0.60
Positive Logits
respective  1.12
themselves  1.10
respectively  1.05
careers  0.94
varying  0.94
theirs  0.85
histories  0.85
individually  0.84
apiece  0.84
overlapping  0.83
Activations Density: 3.097%