INDEX
Explanations
the names of specific government officials
names of government officials and their mentions in discussions
Negative Logits
Avalon        -0.67
Jagu          -0.66
¥µ            -0.64
itar          -0.63
Betty         -0.63
Sheep         -0.62
Pyr           -0.62
Babe          -0.61
Dominion      -0.61
Proposition   -0.60
Positive Logits
briefed       1.06
memos         0.92
aide          0.91
oversaw       0.90
briefings     0.89
testified     0.86
resigned      0.84
chaired       0.83
Jinping       0.82
uchin         0.81
Activations Density: 0.128%