INDEX
    Explanations

    conditional phrases or instances of the word "when."

    Negative Logits
    " Bab"     -0.53
    "ib"       -0.53
    "uro"      -0.52
    " Bob"     -0.51
    "bab"      -0.51
    " Bag"     -0.50
    "udo"      -0.50
    " Band"    -0.49
    "cy"       -0.49
    "ache"     -0.49
    Positive Logits
    " when"       2.34
    "when"        2.16
    " cuando"     2.13
    " quando"     2.08
    " WHEN"       1.85
    "cuando"      1.84
    " kiedy"      1.84
    " когда"      1.83
    " wanneer"    1.80
    " όταν"       1.74
    Act Density 0.309%

    No Known Activations