INDEX
    Explanations

    conditional and hypothetical phrases related to actions and expectations

    Negative Logits
    Token       Logit
    "iese"      -0.17
    "ondo"      -0.17
    "oop"       -0.15
    "ope"       -0.15
    " Mos"      -0.15
    "caler"     -0.15
    "bay"       -0.15
    "xda"       -0.14
    "ossal"     -0.14
    "incare"    -0.14

    Positive Logits
    Token       Logit
    "了一"      0.18
    " 行政"     0.16
    "thed"      0.16
    "ed"        0.15
    ".='"       0.15
    "369"       0.14
    "699"       0.14
    "owanie"    0.14
    "ières"     0.14
    "led"       0.13

    Activation Density: 0.243%

    No Known Activations