Explanations
phrases indicating time progression or continuity
Negative Logits
Token        Logit
uchs         -0.16
INNER        -0.15
коз          -0.15
oppers       -0.14
isen         -0.14
ãģĿãģĨãģª    -0.14
kr           -0.14
ER           -0.13
.lu          -0.13
Inner        -0.13
Positive Logits
Token        Logit
buildup      0.22
run          0.21
upto         0.21
build        0.20
Run          0.19
Build        0.19
Run          0.18
up           0.18
run          0.17
-build       0.17
Activations Density: 0.014%