INDEX
Explanations
phrases indicating time and specific temporal references
Negative Logits
Token     Logit
achs      -0.17
orno      -0.17
лоп       -0.17
rike      -0.16
ص         -0.15
witter    -0.15
aly       -0.14
lish      -0.14
rite      -0.14
Herz      -0.14
Positive Logits
Token     Logit
weekends  0.30
source    0.19
Weekend   0.19
weekend   0.17
pains     0.15
erdem     0.15
æľ        0.15
Source    0.15
/*****************************************************************************↵  0.15
ROL       0.14
Activations Density 0.073%