INDEX
    Explanations

    phrases related to providing explanations, examples, or arguments

    the word "which" in various contexts

    Negative Logits
    Token       Logit
    "dj"        -0.79
    "rolet"     -0.77
    "ovo"       -0.73
    "tty"       -0.73
    "ondon"     -0.73
    "soType"    -0.73
    "bath"      -0.71
    "apult"     -0.70
    "rene"      -0.70
    "redit"     -0.69
    Positive Logits
    Token             Logit
    "upon"            1.20
    "soever"          1.11
    " he"             0.93
    " they"           0.92
    " case"           0.88
    " she"            0.85
    " contestants"    0.81
    " we"             0.76
    " viewers"        0.75
    " cases"          0.74
    Activation Density: 0.049%

    No Known Activations