INDEX
Explanations
instances of things happening or being chosen without a specific pattern or reason
occurrences of the word "randomly" and related terms about random selection or action
Negative Logits
ties        -0.79
Operation   -0.74
ger         -0.74
Millennium  -0.73
gers        -0.71
Agenda      -0.67
iens        -0.67
soc         -0.66
intentions  -0.66
Killer      -0.66
Positive Logits
detonated   0.95
planted     0.92
randomly    0.89
sampled     0.88
combust     0.88
implanted   0.86
configured  0.85
sacrificed  0.85
chose       0.83
regenerate  0.82
Activations Density 0.017%