INDEX
Explanations
mentions of specific names, particularly "Elon Musk" and "Theodore"
mentions of influential figures, particularly Elon Musk and Theodore Roosevelt
Negative Logits
maid        -0.79
lez         -0.72
eding       -0.71
Torrent     -0.71
scenes      -0.69
esome       -0.69
united      -0.67
etz         -0.67
hops        -0.67
alez        -0.66
Positive Logits
Musk        1.32
Elon        0.87
itability   0.81
uates       0.81
Roosevelt   0.80
Brock       0.77
Watts       0.76
ufact       0.76
ancock      0.74
Manning     0.73
Activations Density 0.025%