INDEX
    Explanations

    phrases expressing causation or explanation

    phrases introducing a clause or additional information

    Negative Logits
    Token       Logit
    "athi"      -0.67
    "Bas"       -0.65
    "Behind"    -0.64
    "Rog"       -0.61
    "STE"       -0.60
    " Burg"     -0.58
    "BLE"       -0.57
    "EMBER"     -0.57
    "Crash"     -0.57
    " Hutch"    -0.57
    Positive Logits
    Token             Logit
    " resulted"       0.88
    " admittedly"     0.87
    "allows"          0.84
    " prompts"        0.82
    " presumably"     0.82
    " brings"         0.81
    " fortunately"    0.78
    " thankfully"     0.77
    " incidentally"   0.77
    "milo"            0.75
    Act Density: 0.132%

    No Known Activations