Explanations
sections of text related to an individual's life history and personal background
Negative Logits
odash    -0.16
418      -0.15
485      -0.15
ickle    -0.14
aki      -0.14
ispens   -0.14
enton    -0.14
Dig      -0.14
Decl     -0.14
elde     -0.14
Positive Logits
McGregor   0.17
ohl        0.17
shan       0.16
Mant       0.14
æĹ         0.14
tal        0.14
ass        0.14
ngth       0.14
ibraries   0.14
anh        0.13
Activations Density: 0.006%