Explanations
the word "since" indicating time references or continuity in context
Negative Logits
Tween        -0.17
siguiente    -0.15
áºŃu         -0.15
ingt         -0.15
thers        -0.15
nah          -0.15
mtree        -0.14
endas        -0.14
olec         -0.14
Leban        -0.14
Positive Logits
since        0.18
IPS          0.16
Ùħا          0.15
tant         0.15
forth        0.15
ë¶Ģ          0.14
ammers       0.14
ż            0.14
Since        0.14
Since        0.14
Activations Density 0.059%
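As a rough illustration of how a density figure like the one above could be read, the sketch below assumes that density means the fraction of sampled token positions on which the feature's activation is nonzero. The function name, the threshold parameter, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which
    the feature fires at all. This is an illustrative definition only.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 6 of which activate the feature.
sample = [0.0] * 9994 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.060%
```

Under this reading, a density of 0.059% would correspond to the feature firing on roughly 6 out of every 10,000 tokens.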