INDEX
Explanations
questions and reflections on actions or decisions to be taken in various contexts of societal, ethical, or personal significance
Negative Logits
ursion     -0.26
FX         -0.25
seism      -0.24
origin     -0.23
Dahl       -0.23
flies      -0.23
cre        -0.23
hyde       -0.22
tornado    -0.22
isite      -0.22
Positive Logits
verage     0.30
FTWARE     0.25
ATHER      0.25
EVEN       0.25
ldon       0.24
hire       0.24
ILCS       0.23
<?         0.23
ickson     0.23
avers      0.23
Activations Density: 0.344%