INDEX
    Explanations

    phrases indicating a conclusion or finality

    Negative Logits
    Token          Logit
    "elow"         -0.07
    "(Op"          -0.07
    "izzo"         -0.06
    "à¹īาà¸ĩ"       -0.06
    "nost"         -0.06
    "_Execute"     -0.06
    "******↵↵"     -0.06
    " scal"        -0.06
    "ossier"       -0.06
    "γκα"          -0.06
    Positive Logits
    Token          Logit
    " over"        0.33
    " Over"        0.23
    "Over"         0.23
    "over"         0.23
    "_over"        0.22
    " OVER"        0.22
    "-over"        0.20
    " sobre"       0.19
    "è¿ĩ"          0.18
    ".over"        0.18
    Activation Density: 0.057%

    No Known Activations