INDEX
Explanations
references to historical figures, particularly those related to the Roosevelt family
references to the Roosevelt family, primarily Theodore and Franklin Roosevelt
Negative Logits
RED       -0.72
leon      -0.69
LESS      -0.69
raints    -0.67
ATIVE     -0.66
ateurs    -0.65
binary    -0.64
las       -0.63
Stand     -0.62
rators    -0.62
Positive Logits
Roosevelt  1.15
velt       1.10
enthal     0.83
iets       0.83
éĹĺ        0.82
ufact      0.81
hower      0.80
enstein    0.77
mosqu      0.75
udeau      0.75
Activations Density: 0.010%