Explanations
high-ranking officials mentioned in news articles
references to government officials
Negative Logits
ï¸        -0.94
Horses    -0.75
=-=-      -0.70
ĸļ        -0.69
âĶģ       -0.69
âķIJ       -0.68
esville   -0.67
Bengal    -0.66
||||      -0.65
bows      -0.64
Positive Logits
dom         1.03
ially       0.86
doms        0.76
overseeing  0.76
tasked      0.74
IAL         0.71
iating      0.69
iaries      0.68
ulty        0.68
sanctioned  0.67
Activations Density 0.029%