INDEX
Explanations
temporal phrases indicating duration or time spans
Negative Logits
ugo       -0.16
"***      -0.14
ucc       -0.14
ÙĪÙĨد     -0.14
åĸ        -0.14
Mosul     -0.14
кÑĥÑģ    -0.14
obus      -0.14
Ink       -0.13
[__       -0.13
Positive Logits
Lab       0.16
veau      0.16
endas     0.15
resent    0.14
naveg     0.14
pheres    0.14
ätz       0.14
gezocht   0.14
ded       0.13
issing    0.13
Activations Density: 0.049%