INDEX
Explanations
proper nouns related to politics and individuals
references to specific individuals and their roles in contexts involving communication or relationships
Negative Logits
Token        Logit
ulously      -0.77
GGGGGGGG     -0.73
uctive       -0.73
perjury      -0.72
nesday       -0.70
igree        -0.70
aminer       -0.69
utherford    -0.67
spection     -0.66
PASS         -0.65
Positive Logits
Token        Logit
xus          0.83
®            0.80
§            0.78
«            0.76
eries        0.75
Ń            0.75
¨            0.73
Ĭ            0.73
©¶æ          0.69
er           0.69
Activations Density: 0.013%