Explanations
mentions of deliberate or intentional actions
phrases related to intentionality and purpose
Negative Logits
Token       Logit
Tycoon      -0.74
Citation    -0.73
Rite        -0.70
Warriors    -0.70
Parables    -0.68
busters     -0.68
soon        -0.67
Roses       -0.66
addons      -0.65
Memories    -0.64
Positive Logits
Token       Logit
planted     0.83
plotted     0.76
sacrificed  0.75
sabot       0.74
avoided     0.73
ãĤ©         0.72
aling       0.72
reinvent    0.72
designed    0.72
designed    0.72
Activations Density: 0.013%