INDEX
Explanations
references to familial relationships and name changes
Negative Logits
Token          Logit
Phrase         -0.17
'icon          -0.15
labeling       -0.15
bole           -0.15
zan            -0.14
KF             -0.14
Definitions    -0.14
uw             -0.14
309            -0.14
ichick         -0.13
Positive Logits
Token          Logit
given          0.38
given          0.34
middle         0.32
Given          0.31
Given          0.31
last           0.30
maiden         0.30
_given         0.29
pseud          0.29
GIVEN          0.28
Activations Density: 0.165%