INDEX
    Explanations

    phrases related to waiting or delays in processes

    Negative Logits
    Token    Logit
    91       -0.18
    41       -0.18
    94       -0.17
    71       -0.17
    83       -0.16
    loat     -0.15
    43       -0.15
    87       -0.15
    39       -0.15
    79       -0.15
    Positive Logits
    Token    Logit
    300      0.26
    500      0.25
    100      0.21
    800      0.21
    150      0.21
    250      0.20
    400      0.20
    600      0.19
    350      0.17
    50       0.17
    Activation Density: 0.263%

    No Known Activations