INDEX
Explanations
the name "Max" specifically, especially with a high activation value
the name "Max" in various contexts
Negative Logits
GROUND     -0.82
RECT       -0.80
alam       -0.78
cipline    -0.77
keeper     -0.74
wark       -0.71
CHO        -0.70
soType     -0.69
velt       -0.69
taboola    -0.69
Positive Logits
imus       1.42
imil       1.23
imize      1.17
Payne      1.01
imal       1.01
ima        0.95
itar       0.94
imates     0.91
imen       0.86
ims        0.85
Activations Density 0.010%