Explanations
themes of complexity and interconnected challenges in societal issues
Negative Logits
997          -0.15
exactly      -0.15
998          -0.14
wła          -0.13
imagining    -0.13
somewhere    -0.13
_TRANSFORM   -0.13
somew        -0.13
imagined     -0.13
adem         -0.13
Positive Logits
moral         0.21
universal     0.20
survival      0.20
ethical       0.18
sensitive     0.18
ethics        0.17
cross         0.17
Survival      0.17
universal     0.17
morality      0.16
Activations Density 0.073%