INDEX
Explanations
references to different time points and changes in circumstances
Negative Logits
alim          -0.15
ish           -0.15
wap           -0.15
eren          -0.14
ats           -0.14
beginnings    -0.14
discourse     -0.14
egin          -0.14
uma           -0.14
this          -0.14
Positive Logits
around        0.45
round         0.40
Around        0.40
around        0.40
Around        0.39
-around       0.38
round         0.36
-round        0.33
ROUND         0.29
autour        0.29
Activations Density 0.041%