INDEX
Explanations
proper nouns, particularly names of individuals
Head Attr Weights
0:0.16
1:0.03
2:0.14
3:0.07
4:0.14
5:0.05
6:0.05
7:0.03
8:0.06
9:0.08
10:0.08
11:0.05
Negative Logits
wors         -1.51
ONLY         -1.45
except       -1.44
owing        -1.40
inheritance  -1.39
direction    -1.37
WHERE        -1.37
")           -1.36
"),          -1.35
imports      -1.35
Positive Logits
senal        1.85
Burton       1.81
Neal         1.80
田           1.79
verett       1.79
HAEL         1.73
letcher      1.68
Monte        1.65
ijah         1.65
Leslie       1.64
Activations Density: 0.067%