INDEX
Explanations
repeated instances of the verb "happen" in various forms
NEGATIVE LOGITS
rica            -0.71
ilts            -0.69
stable          -0.66
urdy            -0.66
ikk             -0.64
Flavoring       -0.64
imar            -0.64
nor             -0.64
grown           -0.64
heddar          -0.63
POSITIVE LOGITS
spontaneously   0.88
everywhere      0.87
naturally       0.81
differently     0.80
catast          0.80
uate            0.79
concurrently    0.78
anywhere        0.77
nowhere         0.77
synchron        0.77
Activations Density 0.030%