Explanations
instances of the word "beginning" and phrases related to origins or introductions
Negative Logits
Token         Logit
URA           -0.16
ola           -0.15
ification     -0.15
å½            -0.14
acy           -0.14
ories         -0.14
abit          -0.14
егоÑĢ         -0.14
ule           -0.14
flesh         -0.14
Positive Logits
Token         Logit
stages        0.31
-middle       0.29
/end          0.25
nings         0.25
-stage        0.25
stage         0.21
/Foundation   0.21
steps         0.19
ling          0.19
phases        0.18
Activations Density: 0.029%