INDEX
Explanations
proper nouns ending with "bert."
the name "Gilbert."
Negative Logits
Token       Logit
lock        -0.70
swap        -0.68
swapped     -0.65
hereafter   -0.65
end         -0.63
Chain       -0.63
yours       -0.62
HI          -0.62
haven       -0.62
marathon    -0.62
Positive Logits
Token       Logit
bert        4.63
berto       1.71
bern        1.62
bart        1.46
berman      1.43
ber         1.41
enegger     1.24
BER         1.23
berger      1.21
bard        1.19
Activations Density: 0.015%