INDEX
Explanations
references to familial relationships, particularly those involving mothers
Head Attr Weights
0:0.07
1:0.07
2:0.07
3:0.08
4:0.04
5:0.03
6:0.12
7:0.06
8:0.05
9:0.30
10:0.03
11:0.03
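
The weights above can be read programmatically to see which head dominates. The following is a minimal sketch, assuming the list gives one attribution weight per attention head (indices 0 to 11); the names `head_weights` and `rank_heads` are hypothetical and not part of the original page.

```python
# Head attribution weights copied from the list above (head index -> weight).
# Assumption: each entry is the share of the feature attributed to that head.
head_weights = {
    0: 0.07, 1: 0.07, 2: 0.07, 3: 0.08, 4: 0.04, 5: 0.03,
    6: 0.12, 7: 0.06, 8: 0.05, 9: 0.30, 10: 0.03, 11: 0.03,
}


def rank_heads(weights):
    """Return (head, weight) pairs sorted from largest to smallest weight."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)


ranked = rank_heads(head_weights)
top_head, top_weight = ranked[0]
total = sum(head_weights.values())

print(f"top head: {top_head} (weight {top_weight:.2f})")
print(f"sum of listed weights: {total:.2f}")
```

On these values, head 9 carries the largest weight (0.30), followed by head 6 (0.12). The listed weights are rounded to two decimals, so their sum need not be exactly 1.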
Negative Logits
��:-3.88
HF:-3.62
LX:-3.26
++++:-3.15
Warranty:-3.08
hire:-3.06
XIV:-3.03
Wine:-3.02
icious:-3.01
XIII:-2.95
Positive Logits
mother:3.94
mom:3.81
parents:3.39
classmates:3.34
mum:3.33
Maced:3.19
classmate:3.13
Mississ:3.06
Columb:3.02
recol:2.99
Activations Density 0.001%