INDEX
Explanations
high-ranking government officials and diplomats
names of individuals, particularly officials and their titles
Negative Logits
Hallow        -0.76
juries        -0.74
Scand         -0.73
Proposition   -0.72
race          -0.70
Huck          -0.68
Crow          -0.68
represented   -0.66
Avalon        -0.66
Crom          -0.65
Positive Logits
briefed       1.08
enei          1.06
Jinping       0.99
zinski        0.98
Lavrov        0.97
confid        0.97
briefings     0.96
memos         0.96
plom          0.94
tasked        0.94
Activations Density: 0.195%