Explanations
words related to randomness or random selection
instances of the words "randomly" and "accidentally."
Negative Logits
gers          -0.78
antics        -0.75
ger           -0.72
ador          -0.70
rers          -0.69
attainment    -0.68
soc           -0.68
uese          -0.68
ryu           -0.67
ges           -0.67
Positive Logits
detonated     0.84
stumbled      0.80
worshipped    0.78
bumped        0.76
located       0.75
combust       0.75
planted       0.75
placed        0.74
timed         0.74
chose         0.73
Activations Density 0.034%