INDEX
Explanations
instances of the word "starts" or related terms implying the beginning of a story or process
the beginnings of stories or narratives
Negative Logits
mented      -0.80
illard      -0.78
aths        -0.76
ilit        -0.74
mens        -0.73
luence      -0.70
ordinary    -0.69
ada         -0.68
otropic     -0.67
itte        -0.67
Positive Logits
anew        0.92
airing      0.82
raining     0.77
BELOW       0.73
nings       0.71
Ń·          0.69
leaking     0.68
dusk        0.68
abruptly    0.67
bothering   0.66
Activations Density 0.080%