INDEX
Explanations
concepts related to fate and moral responsibility
Negative Logits
lk              -0.16
lasting         -0.15
lasting         -0.15
ibase           -0.15
utschein        -0.15
ëł¹             -0.14
notated         -0.14
à¹Ģà¸ģà¸Ńร      -0.14
hã              -0.14
aign            -0.14
Positive Logits
dictates        0.22
intervened      0.21
interven        0.20
intervening     0.20
intervene       0.18
favors          0.18
dictate         0.18
demand          0.17
Dict            0.17
Dict            0.17
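The page does not say how these logit lists are produced. A common approach for dashboards of this kind is to project a feature's output direction through the model's unembedding matrix and list the tokens with the largest and smallest resulting values. The sketch below illustrates that idea only. The function name `top_logit_effects`, the two-dimensional vectors, and all numbers are invented for illustration and are not the values shown above.

```python
def top_logit_effects(direction, unembed, vocab, k=2):
    """Return (most negative, most positive) tokens by logit effect.

    direction: the feature's output direction (hypothetical toy vector)
    unembed:   one row of unembedding weights per vocabulary token
    vocab:     token strings, in the same order as the rows of unembed
    """
    # Logit effect of the feature on each token = dot(direction, token row).
    effects = [
        (token, sum(d * w for d, w in zip(direction, row)))
        for token, row in zip(vocab, unembed)
    ]
    ranked = sorted(effects, key=lambda pair: pair[1])
    # Lowest k values first, then the highest k values in descending order.
    return ranked[:k], ranked[::-1][:k]


# Toy data: made-up weights, not taken from any real model.
vocab = ["dictates", "intervene", "lk", "lasting"]
unembed = [[0.5, 0.0], [0.25, 0.25], [-0.5, 0.0], [0.0, -0.25]]
direction = [0.5, 0.25]

negative, positive = top_logit_effects(direction, unembed, vocab)
print("negative:", negative)
print("positive:", positive)
```

With real weights the same ranking would run over the full vocabulary, which is why unrelated or non-English token fragments can appear at the negative end.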
Activations Density 0.307%
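Activation density is usually read as the fraction of sampled tokens on which the feature fires at all, so 0.307% would correspond to roughly 3 tokens in every 1,000. The sketch below shows that calculation under the assumption that "fires" means an activation above zero. The page does not state the exact definition or threshold, and the function name and sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds the threshold."""
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1,000 tokens, 3 of which activate the feature.
sample = [0.0] * 997 + [1.2, 0.4, 3.1]
print(f"{activation_density(sample):.3%}")
```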