Explanations
phrases related to personal and professional relationships
Head Attr Weights
0:0.25
1:0.04
2:0.01
3:0.12
4:0.07
5:0.07
6:0.04
7:0.02
8:0.24
9:0.05
10:0.01
11:0.03
Negative Logits
enium       -1.78
uniformly   -1.64
nort        -1.57
blinked     -1.54
orange      -1.52
cloudy      -1.51
ammers      -1.51
clustered   -1.51
stabilized  -1.50
terday      -1.49
Positive Logits
memoir         1.80
personal       1.76
intimate       1.74
ebin           1.71
affairs        1.69
condol         1.68
autobiography  1.64
aryn           1.61
alias          1.60
Footnote       1.59
Activations Density 0.042%