INDEX
Explanations
mentions of children's books and their characteristics
Negative Logits
Token       Value
.Modules    -0.15
Ladies      -0.15
ãĤ¤ãĥī      -0.14
leh         -0.14
partment    -0.14
åѦéĻ¢      -0.14
ãĥ¥         -0.14
uld         -0.14
Idol        -0.14
lac         -0.14
Positive Logits
Token        Value
children     0.23
(children    0.19
children     0.18
story        0.17
ildren       0.17
illustrator  0.17
kids         0.17
read         0.16
.children    0.16
kids         0.16
Activations Density: 0.148%