INDEX
Explanations
phrases that indicate the overall experience or duration of events
Negative Logits
buz         -0.16
кет         -0.14
rial        -0.14
cleared     -0.14
clearing    -0.14
SEG         -0.14
kaar        -0.14
alık        -0.14
orig        -0.14
les         -0.14
Positive Logits
throughout   0.21
Throughout   0.19
Throughout   0.18
dock         0.16
hots         0.15
escorte      0.15
oucher       0.15
dorf         0.15
951          0.14
gebn         0.14
Activations Density 0.088%