INDEX
Explanations
themes related to appreciation and the value of repairing and maintaining possessions, along with various aspects of societal processes and responsibilities
Negative Logits
():↵↵
-0.21
":↵↵
-0.20
:↵↵↵
-0.20
:↵↵↵↵
-0.18
':↵↵
-0.17
:↵↵↵↵↵↵
-0.17
：↵↵
-0.17
>).
-0.16
{}.
-0.16
]:↵↵
-0.15
Positive Logits
;↵
0.28
；↵
0.24
);↵
0.23
;↵
0.22
)↵
0.22
]↵
0.22
!;↵
0.21
];↵
0.20
()↵
0.20
}↵
0.19
Activations Density 0.884%