INDEX
Explanations
references to time, specifically the word "now"
Negative Logits
Token    Logit
412      -0.15
413      -0.15
udas     -0.14
care     -0.14
unter    -0.14
rat      -0.14
ccoli    -0.13
令       -0.13
829      -0.13
üst      -0.13
Positive Logits
Token      Logit
/current   0.17
ark        0.17
bie        0.15
minster    0.15
akers      0.15
tec        0.15
Testament  0.15
utex       0.15
-age       0.15
kir        0.14
Activations Density: 0.019%