INDEX
Explanations
references to daily life and routine experiences
Negative Logits
amed        -0.17
panse       -0.15
å§ĭ         -0.15
ød          -0.14
ван         -0.14
intel       -0.14
Finished    -0.14
IBUT        -0.14
æķ´         -0.13
³           -0.13
Positive Logits
-life        0.20
üstü         0.19
tasks        0.19
routines     0.18
activities   0.18
occurrences  0.17
ROUT         0.17
occurrence   0.17
routine      0.17
readcr       0.17
Activations Density: 0.025%