Explanations
mentions of political or government positions, specifically the title "Minister"
mentions of government officials or ministers
Negative Logits
Token         Logit
vent          -0.84
feat          -0.80
eros          -0.68
ever          -0.68
eat           -0.67
lightsaber    -0.65
gers          -0.64
rive          -0.64
ilers         -0.64
DIS           -0.63
Positive Logits
Token         Logit
ial           0.99
ially         0.98
arians        0.83
onse          0.79
ials          0.77
Minister      0.77
iture         0.77
arian         0.75
aceutical     0.73
itol          0.73
Activations Density: 0.025%