INDEX
Explanations
phrases related to familial connections and relationships
Head Attr Weights
0:0.02
1:0.01
2:0.09
3:0.06
4:0.14
5:0.03
6:0.03
7:0.37
8:0.03
9:0.04
10:0.07
11:0.06
Negative Logits
nih      -1.69
god      -1.53
resp     -1.51
mbuds    -1.44
ysis     -1.42
ビ       -1.42
cancer   -1.41
ukemia   -1.37
anooga   -1.37
zers     -1.37
Positive Logits
seamlessly   2.00
neatly       1.94
seamless     1.65
strands      1.61
woven        1.60
weave        1.56
intertw      1.55
rette        1.54
sands        1.53
colourful    1.52
Activations Density 0.001%