INDEX
Explanations
themes related to self-sacrifice and altruism
Negative Logits
.scalablytyped  -0.15
SPDX            -0.14
ikel            -0.14
]={↵            -0.13
νη              -0.13
roud            -0.13
jal             -0.13
ÃŃÅ¡e           -0.13
ickets          -0.13
ope             -0.13
Positive Logits
sacrifice       0.60
sacrifices      0.55
Sacr            0.52
sacrific        0.52
sacrificed      0.52
sacrificing     0.51
sacr            0.51
sac             0.46
SAC             0.44
trade           0.38
Activations Density 0.192%