INDEX
    Explanations

    words related to temporal sequences or events that occur prior to others

    Negative Logits
    Token                   Logit
    ' habet'                -0.83
    '__":\n'                -0.80
    'ſelf'                  -0.79
    'KommentareTeilen'      -0.73
    'gany'                  -0.71
    ' Jefus'                -0.71
    'ษัท'                   -0.69
    '%)$'                   -0.68
    'gdx'                   -0.68
    ' følge'                -0.65
    Positive Logits
    Token                   Logit
    ' before'               1.96
    'before'                1.82
    ' Before'               1.71
    ' BEFORE'               1.70
    'BEFORE'                1.67
    'Before'                1.63
    ' sebelum'              1.59
    ' befo'                 1.37
    ' antes'                1.35
    ' πριν'                 1.35
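
    The positive-logit list above is dominated by spelling variants of one word. As a rough illustration (not part of the original dashboard), the sketch below groups those tokens by a normalized surface form, ignoring case and leading whitespace. The function name `group_by_surface_form` and the normalization rule are assumptions made for this example. The remaining tokens are left as separate groups: "sebelum" (Indonesian/Malay), "antes" (Spanish/Portuguese), and "πριν" (Greek) all mean "before", and "befo" appears to be a truncated piece of the same word.

```python
from collections import defaultdict

# Top positive logits copied from the list above: (token, logit weight).
# Leading spaces are part of the token and are kept as-is.
POSITIVE_LOGITS = [
    (" before", 1.96), ("before", 1.82), (" Before", 1.71),
    (" BEFORE", 1.70), ("BEFORE", 1.67), ("Before", 1.63),
    (" sebelum", 1.59), (" befo", 1.37), (" antes", 1.35), (" πριν", 1.35),
]


def group_by_surface_form(pairs):
    """Group tokens that differ only in case or surrounding whitespace.

    Hypothetical helper for illustration; the normalization (strip + casefold)
    is an assumption, not something the dashboard itself performs.
    """
    groups = defaultdict(list)
    for token, weight in pairs:
        groups[token.strip().casefold()].append((token, weight))
    return dict(groups)


groups = group_by_surface_form(POSITIVE_LOGITS)
for form, members in groups.items():
    top = max(weight for _, weight in members)
    print(f"{form!r}: {len(members)} token(s), highest logit {top:.2f}")
```

    Under this normalization, six of the ten tokens collapse into the single form "before", which is consistent with the explanation given for this feature.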
    Activation Density: 0.694%

    No Known Activations