    Explanations

    phrases indicating expectations or future actions

    Negative Logits
    ^(@)            -0.92
     Efq            -0.86
    adaptiveStyles  -0.83
     Jefus          -0.83
     cdti           -0.79
     Houſe          -0.77
    SBATCH          -0.77
     للمعارف        -0.76
    InitVars        -0.75
     $_"            -0.74
    Positive Logits
     according      1.62
    according       1.53
    According       1.51
     According      1.47
    Según           1.42
     selon          1.35
     Selon          1.34
     Secondo        1.32
     Según          1.30
    Selon           1.30
    Activation Density: 0.112%

    No Known Activations