Explanations
references to early development and childhood themes
Negative Logits
bak       -0.20
older     -0.16
at        -0.16
maker     -0.14
ig        -0.14
ably      -0.14
able      -0.14
.Apis     -0.14
iram      -0.14
ëŁī       -0.14
Positive Logits
-stage    0.29
stages    0.24
-warning  0.21
zeitig    0.20
stage     0.20
wood      0.20
morning   0.18
-middle   0.18
ãĢģä¸Ń    0.18
adopt     0.17
Activations Density 0.043%