Explanations
names or titles related to government officials
references to government officials and their roles
Negative Logits
vent          -0.73
feat          -0.70
Reincarn      -0.64
Samurai       -0.62
vantage       -0.61
torch         -0.60
DIS           -0.59
lightsaber    -0.58
rive          -0.57
Totem         -0.57
Positive Logits
ial           1.15
arians        1.08
ially         1.04
ials          0.98
arian         0.97
onse          0.88
itol          0.84
ional         0.81
iture         0.80
isms          0.79
Activations Density 0.030%