INDEX
Explanations
words representing the beginning or starting point of an action or process
instances of the word "begin" and its variations
Negative Logits
rats        -0.70
allo        -0.69
oted        -0.69
acho        -0.68
aths        -0.67
rome        -0.67
iliary      -0.67
Sov         -0.66
ilty        -0.63
oute        -0.63
Positive Logits
anew        1.00
ITIES       0.86
nings       0.80
OPLE        0.70
attRot      0.70
NetMessage  0.68
"$:/        0.68
igmatic     0.67
lihood      0.66
EGIN        0.64
Activations Density: 0.026%