INDEX
    Explanations

    temporal phrases indicating duration or time spans

    Negative Logits
    ugo         -0.16
     "***       -0.14
    ucc         -0.14
    ÙĪÙĨد       -0.14
    åĸ          -0.14
     Mosul      -0.14
    кÑĥÑģ       -0.14
    obus        -0.14
     Ink        -0.13
    [__         -0.13
    Positive Logits
     Lab         0.16
    veau         0.16
    endas        0.15
    resent       0.14
     naveg       0.14
    pheres       0.14
    ätz          0.14
     gezocht     0.14
     ded         0.13
    issing       0.13
    Activation Density: 0.049%

    No Known Activations