INDEX
Explanations
references to political figures and actions related to governance
Negative Logits
Folk             -0.72
partName         -0.71
contention       -0.64
battleground     -0.64
Canaan           -0.63
filler           -0.61
pilgr            -0.61
collaborations   -0.61
folklore         -0.59
Builder          -0.59
Positive Logits
should           1.00
sic              0.92
efe              0.86
killed           0.82
ï¸ı              0.82
could            0.81
ould             0.80
mir              0.79
own              0.79
extremely        0.78
Activations Density 0.224%
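
The density figure above can be read as the share of token positions on which the feature fires. The sketch below is a minimal illustration under that assumption; the dashboard does not state its exact definition, and the function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a non-zero
    (above-threshold) activation. The source does not confirm this.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    # Count positions where the feature is considered active.
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 2 of 8 token positions.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.7, 0.0]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.224% would correspond to the feature firing on roughly 2 out of every 1,000 tokens.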