INDEX
Explanations
phrases related to initiating or beginning actions
instances of the word 'start' and its variations
Negative Logits
entirety    -0.74
obi         -0.69
illard      -0.66
phy         -0.65
ighth       -0.64
itsch       -0.64
wrought     -0.62
pedia       -0.62
ocene       -0.60
acho        -0.59
Positive Logits
anew        1.03
nings       0.85
starting    0.79
behaving    0.76
raining     0.73
strap       0.73
ribune      0.72
rek         0.69
ners        0.69
around      0.67
Activations Density: 0.071%