INDEX
Explanations
proper nouns, specifically names
the name "Kirk" and its associated context
Negative Logits
Token        Logit
*/(          -0.83
ĻĤ           -0.77
abuse        -0.76
legraph      -0.73
ntil         -0.72
receptive    -0.72
outgoing     -0.71
ittee        -0.70
ccording     -0.69
uctor        -0.66
Positive Logits
Token        Logit
patrick      1.45
Kirk         1.17
Cousins      0.93
bys          0.90
lands        0.90
sey          0.86
Hamm         0.86
STON         0.84
lake         0.83
wall         0.83
Activations Density 0.003%