INDEX
Explanations
references to heroic actions or heroic individuals
words associated with heroism and bravery
Negative Logits
enfranch    -0.83
upon        -0.79
washer      -0.74
spring      -0.72
estate      -0.72
agree       -0.71
payer       -0.71
essee       -0.70
arate       -0.70
oval        -0.68
Positive Logits
bravery     1.01
heroic      0.93
daring      0.92
heroism     0.92
courage     0.90
brave       0.84
courageous  0.84
feats       0.83
Expedition  0.78
inspiring   0.75
Activations Density: 0.026%