    Explanations

    the word 'which' in varying contexts

    Negative Logits
    Token        Logit
    " Olms"      -0.69
    "ed"         -0.67
    " Juneau"    -0.67
    "ded"        -0.66
    "cy"         -0.63
    " Baton"     -0.63
    " Coss"      -0.62
    " Magee"     -0.61
    " Folsom"    -0.60
    " Hov"       -0.60

    Positive Logits
    Token        Logit
    " WHICH"     1.25
    " Which"     1.24
    "Which"      1.17
    "which"      1.14
    " which"     1.13
    " wich"      1.11
    "Datuak"     1.09
    "]**"        1.05
    "']))\n"     1.04
    "hich"       0.96

    Activation Density: 0.153%

    No Known Activations
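
As a rough check of the explanation against the positive logits listed above, the following Python sketch counts how many of the ten top tokens reduce to "which" once case and surrounding whitespace are ignored. The names `POSITIVE_LOGITS` and `is_which_variant` are illustrative and not part of the dashboard. The token strings and values are copied from the list, and the trailing newline on the `']))` token is an assumption about how that token was rendered.

```python
# Top positive-logit tokens copied from the list above, as (token, logit) pairs.
# Leading spaces are part of the token text.
# ASSUMPTION: the "\n" in "']))\n" reflects a token that ends in a newline.
POSITIVE_LOGITS = [
    (" WHICH", 1.25),
    (" Which", 1.24),
    ("Which", 1.17),
    ("which", 1.14),
    (" which", 1.13),
    (" wich", 1.11),
    ("Datuak", 1.09),
    ("]**", 1.05),
    ("']))\n", 1.04),
    ("hich", 0.96),
]


def is_which_variant(token: str) -> bool:
    """Return True if the token equals 'which' up to case and surrounding whitespace."""
    return token.strip().lower() == "which"


# Tokens that match exactly after normalization.
exact = [tok for tok, _ in POSITIVE_LOGITS if is_which_variant(tok)]
# Tokens that do not match, including near-misses such as " wich" and "hich".
others = [tok for tok, _ in POSITIVE_LOGITS if not is_which_variant(tok)]

print(f"{len(exact)} of {len(POSITIVE_LOGITS)} top tokens are case/spacing variants of 'which'")
print("not matched:", [repr(tok) for tok in others])
```

The check is deliberately strict: the misspelling " wich" and the fragment "hich" are reported as unmatched even though they are plausibly related to the same word, so the count is a lower bound on how much of the list supports the explanation.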