INDEX
Explanations
occurrences of the word "start" and related terms
Negative Logits
(no token shown)  -0.42
(no token shown)  -0.41
↵↵                -0.40
s                 -0.40
i                 -0.40
приняли           -0.39
day               -0.38
alu               -0.38
t                 -0.38
com               -0.38
Positive Logits
ſta       1.09
ſelf      1.07
Houſe     1.04
pleaſure  1.04
ſtand     1.04
Reſ       1.03
raiſ      1.02
Majefty   1.00
Efq       0.97
ſche      0.96
Activations Density 0.186%