INDEX
    Explanations

    phrases indicating causation or reason

    phrases that introduce reasoning or justification

    Negative Logits
    Token        Logit
    "ãĤ©"        -0.70
    "zona"       -0.69
    "LAB"        -0.68
    "Contact"    -0.65
    "istine"     -0.63
    "rador"      -0.63
    "oslav"      -0.62
    "obyl"       -0.61
    "é¾"         -0.61
    "urred"      -0.59
    Positive Logits
    Token          Logit
    "give"         1.46
    " example"     1.42
    " instance"    1.38
    " starters"    1.29
    "bidden"       1.28
    "cing"         1.26
    "getting"      1.24
    "cible"        1.18
    "gotten"       1.08
    " reasons"     1.04
    Activation Density: 0.058%

    No Known Activations