    Explanations

    phrases related to taking actions or measures, especially in response to issues or concerns

    Negative Logits
    Token        Logit
    "407"        -0.16
    "oran"       -0.15
    "467"        -0.15
    "972"        -0.15
    "šov"        -0.14
    "603"        -0.14
    "605"        -0.14
    "ãĥ³ãĥĩ"     -0.14
    "orning"     -0.14
    "507"        -0.14
    Positive Logits
    Token             Logit
    " steps"          0.36
    " measures"       0.28
    " Steps"          0.26
    "steps"           0.26
    "Steps"           0.24
    " necessary"      0.22
    "_steps"          0.22
    " appropriate"    0.22
    " firm"           0.21
    " fir"            0.21
    Activation Density: 0.035%

    No Known Activations