INDEX
Explanations
references to caregiving and nurturing actions
Negative Logits
ramer
-0.18
inho
-0.17
ADE
-0.16
Clinton
-0.15
ator
-0.15
generics
-0.15
ators
-0.14
Burg
-0.14
astr
-0.14
ade
-0.14
Positive Logits
********************************************************************************
0.16
ÙĪØ§Ø±
0.16
aju
0.15
ãĥĭãĥĥãĤ¯
0.15
éģĵè·¯
0.15
ambah
0.15
]={↵
0.15
igel
0.14
ocratic
0.14
rium
0.14
Activations Density 0.001%