INDEX
Explanations
proper nouns such as names of people, places, and organizations
proper nouns and affiliations, particularly related to individuals and their backgrounds
Negative Logits
onies        -0.78
Otherwise    -0.75
OTHER        -0.69
Examples     -0.68
Accessory    -0.67
Translation  -0.67
respective   -0.66
pmwiki       -0.65
Higher       -0.64
tips         -0.63
Positive Logits
he       1.03
Mr       0.90
she      0.90
Dr       0.78
Conrad   0.78
Rodrigo  0.77
Prof     0.77
Jerome   0.77
Cecil    0.76
Denis    0.75
Activations Density: 0.176%