INDEX
Explanations
proper names, specifically the name "Alfred" with a high activation value
mentions of specific names, particularly "Alfred" and "Gustav"
Negative Logits
cript        -0.79
leading      -0.79
stakes       -0.77
former       -0.76
boarding     -0.75
BOOK         -0.75
ItemImage    -0.70
TBD          -0.68
cv           -0.68
leaders      -0.67
Positive Logits
sson             1.04
Alfred           1.02
corrid           0.99
sburg            0.94
obser            0.86
ulla             0.84
ibrary           0.82
destro           0.80
anca             0.78
acquaintance     0.77
Activations Density 0.003%
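
The density figure above can be read as the share of token positions on which the feature fires. The page does not define it, so the following is only a minimal sketch under that assumption: the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: "density" means the share of token positions on which
    the feature is active. The source page does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 3 of 100,000 token positions.
sample = [0.0] * 99_997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.003%
```

Under this reading, a density of 0.003% corresponds to roughly 3 active positions per 100,000 tokens, which is consistent with a feature tied to a few specific names.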