Explanations
references to family members, particularly siblings
references to siblings and twin relationships
Negative Logits
Token      Logit
cens       -0.72
ventory    -0.71
Talks      -0.68
nesty      -0.68
hops       -0.67
rays       -0.67
Bots       -0.66
Numbers    -0.65
amnesty    -0.64
robe       -0.64
Positive Logits
Token      Logit
sibling    2.38
twin       2.17
siblings   1.87
Twin       1.26
iblings    1.09
ibling     0.99
sister     0.99
brother    0.98
cousins    0.92
cousin     0.91
Activations Density: 0.011%