INDEX
Explanations
proper nouns, particularly names of people and organizations
Head Attr Weights
0:0.02
1:0.03
2:0.07
3:0.15
4:0.36
5:0.04
6:0.04
7:0.03
8:0.05
9:0.05
10:0.06
11:0.04
Negative Logits
wcs        -1.99
alties     -1.84
notation   -1.72
chwitz     -1.67
artifacts  -1.65
hasn       -1.64
istries    -1.60
sites      -1.59
fixtures   -1.58
weren      -1.57
Positive Logits
Osw           1.81
(@            1.63
Wend          1.57
commenter     1.52
rhet          1.48
Krish         1.48
Psych         1.46
theolog       1.44
Ov            1.44
Shutterstock  1.43
Activations Density 0.042%