INDEX
Explanations
mentions of government officials and their titles
references to government ministers and their roles
Negative Logits
vent          -0.74
feat          -0.63
theaters      -0.59
gers          -0.57
Reincarn      -0.57
Artists       -0.57
Savior        -0.57
resemblance   -0.57
century       -0.56
superiority   -0.56
Positive Logits
ial       1.21
arians    1.14
arian     1.10
ials      1.01
ially     0.94
onse      0.86
iate      0.80
itol      0.79
isms      0.77
oun       0.76
Activations Density 0.030%