Explanations
phrases and words associated with time and actions
Negative Logits
token     logit
odel      -0.17
tam       -0.16
edio      -0.16
815       -0.16
Tam       -0.15
adero     -0.15
Tam       -0.14
asco      -0.14
essian    -0.14
ieber     -0.14
Positive Logits
token     logit
każ       0.19
每        0.19
daily     0.19
mỗi       0.17
daily     0.17
every     0.17
each      0.17
每        0.17
κάθε      0.17
each      0.16
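Some positive-logit tokens are stored by the tokenizer as remapped byte strings: 每 appears as `æ¯ı` and mỗi as `má»Ĺi`. The sketch below shows one way to recover the readable form, assuming the model uses a GPT-2-style byte-level BPE vocabulary (the dashboard does not state which tokenizer is used). The helper name `decode_token` is hypothetical.

```python
def bytes_to_unicode():
    """Build the GPT-2-style map from each byte value to a printable character."""
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            # Non-printable bytes are shifted to code points 256 and above.
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


# Inverse table: displayed character -> original byte value.
BYTE_DECODER = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: turn a byte-level token string into readable text.

    Tokens containing characters outside the byte table (already readable
    text such as Greek or Polish letters) are returned unchanged.
    """
    if any(ch not in BYTE_DECODER for ch in token):
        return token
    raw = bytes(BYTE_DECODER[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["æ¯ı", "má»Ĺi", "daily", "κάθε"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, `æ¯ı` decodes to 每 (Chinese, "every") and `má»Ĺi` to mỗi (Vietnamese, "each"). Together with każ and κάθε, this suggests the feature promotes words meaning "every" or "each" across several languages.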
Activations Density 0.312%