INDEX
Explanations
comparative phrases or expressions that indicate change over time
Negative Logits
oders    -0.15
ivní     -0.15
gon      -0.14
決       -0.14
bons     -0.14
oram     -0.14
Obt      -0.13
正       -0.13
òn       -0.13
ORY      -0.13
Positive Logits
otherwise    0.19
did          0.17
than         0.17
otherwise    0.16
jinak        0.16
mania        0.15
Otherwise    0.15
Otherwise    0.14
entence      0.14
hall         0.14
Activations Density 0.038%