Explanations
mentions of specific names related to governmental officials
NEGATIVE LOGITS
ulia       -0.68
juries     -0.67
acters     -0.66
Jagu       -0.65
skelet     -0.63
Ü          -0.63
Ideal      -0.63
Witches    -0.63
âĸ¬        -0.62
Load       -0.61
POSITIVE LOGITS
briefed     1.24
aide        1.04
chaired     0.99
briefings   0.98
oversaw     0.96
memos       0.95
reiterated  0.93
testified   0.92
resigned    0.92
confid      0.91
Activations Density 0.133%