Explanations
names ending in 'bert'
the repetition of the name "Bert" in various contexts
Negative Logits
Token       Logit
esthetic    -0.78
heed        -0.74
ntil        -0.72
phis        -0.70
IFT         -0.69
olved       -0.68
istor       -0.65
oter        -0.65
stream      -0.65
phan        -0.64
Positive Logits
Token       Logit
atoes       0.86
stadt       0.79
Hoover      0.78
Ames        0.76
Blumenthal  0.75
stown       0.73
rics        0.73
sson        0.72
furt        0.71
aum         0.70
Activations Density: 0.019%