INDEX
Explanations
phrases indicating the start or conclusion of events
Negative Logits
leen     -0.16
at       -0.15
ese      -0.15
kr       -0.14
Dawn     -0.14
UPS      -0.14
icens    -0.14
IEL      -0.14
CASE     -0.14
rub      -0.14
Positive Logits
/end     0.25
of       0.23
end      0.20
stages   0.19
-end     0.16
Boss     0.16
/start   0.16
Of       0.16
vey      0.15
Boss     0.15
Activations Density 0.033%