INDEX
Explanations
references to familial relationships and emotional connections
Negative Logits
afi       -0.16
apon      -0.15
ults      -0.15
AtPath    -0.15
hic       -0.15
DIR       -0.14
omed      -0.14
Bene      -0.14
olumn     -0.13
undry     -0.13
Positive Logits
leaves    0.28
Leaves    0.25
will      0.17
will      0.17
WILL      0.16
Will      0.16
será      0.15
Leaf      0.15
lived     0.15
joins     0.15
Activations Density 0.024%