INDEX
Explanations
mention of specific events or details in news articles
references to leadership positions and transitions
Negative Logits
."[    -0.68
+.    -0.64
".[    -0.63
%.    -0.63
.�    -0.62
usercontent    -0.61
".    -0.60
.).    -0.60
.<    -0.59
!".    -0.59
Positive Logits
wealth    0.52
leaf    0.50
pires    0.49
bernatorial    0.48
urances    0.48
sequ    0.47
wedding    0.46
Ĭ±    0.46
©¶æ    0.46
iph    0.45
Activations Density: 2.446%