Explanations
phrases related to finality or urgency
Negative Logits
lets       -0.15
latest     -0.15
liğinin    -0.15
pong       -0.15
олов       -0.15
posite     -0.15
кра        -0.14
latest     -0.14
زل         -0.14
最新       -0.14
Positive Logits
gas        0.35
remaining  0.31
hur        0.28
Gas        0.28
remaining  0.27
Remaining  0.26
Gas        0.26
gas        0.25
_gas       0.24
ditch      0.24
Activations Density: 0.044%