Explanations
phrases indicating persistence or continuation in various contexts
Negative Logits
Token        Logit
ãĤıãģij       -0.14
nameof       -0.13
nearest      -0.13
вмÑĸ         -0.13
586          -0.13
omore        -0.13
rop          -0.13
686          -0.13
reservoir    -0.12
077          -0.12
Positive Logits
Token        Logit
stay         1.14
Stay         1.05
stays        1.04
stayed       1.03
stay         1.00
Stay         1.00
staying      0.99
remain       0.69
remained     0.69
remains      0.58
Activations Density: 0.264%