INDEX
    Explanations

    phrases indicating temporal relationships or connections

    Negative Logits
    Token                 Logit
    "featureID"           -0.91
    " Roskov"             -0.82
    "verwijspagina"       -0.71
    " disambiguazione"    -0.69
    "StructEnd"           -0.63
    "InitVars"            -0.61
    " فريبيس"             -0.61
    " defStyleAttr"       -0.60
    "AndEndTag"           -0.60
    " autorytatywna"      -0.59
    Positive Logits
    Token                 Logit
    " equally"            0.35
    "ENOT"                0.35
    " igualmente"         0.33
    " yet"                0.32
    "tvguidetime"         0.32
    " similarly"          0.31
    " necesariamente"     0.31
    " Fä"                 0.31
    " inom"               0.30
    " necessarily"        0.30
    Activation Density: 0.153%

    No Known Activations