INDEX
Explanations
mentions of helmets
occurrences of the word "helmet."
Negative Logits
VD          -0.71
ABE         -0.67
tery        -0.66
ween        -0.66
hower       -0.66
Huntington  -0.64
atoes       -0.64
Bulletin    -0.64
Radical     -0.64
Roosevelt   -0.64
Positive Logits
worn        1.03
helmets     1.03
helmet      0.97
wearer      0.86
Helmet      0.86
goggles     0.83
equipped    0.77
lain        0.76
adorned     0.73
wash        0.72
Activations Density: 0.019%