Explanations
proper names, particularly names that start with "Bert" or "Betty"
references to a specific name, particularly "Bert," in various contexts
Negative Logits
Token               Logit
ntax                -0.73
exclusive           -0.73
esthetic            -0.71
ATIVE               -0.68
othal               -0.68
PLIED               -0.66
Drug                -0.66
phones              -0.65
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯    -0.64
portation           -0.64
Positive Logits
Token               Logit
rand                1.08
ric                 1.01
oken                0.98
ram                 0.97
rics                0.97
ardo                0.97
ucci                0.96
olini               0.94
ards                0.91
rams                0.89
Activations Density: 0.009%