INDEX
    Explanations

    phrases indicating a response or action based on a specific situation or condition

    the word "accordingly" and related phrases indicating compliance or adaptation

    Negative Logits
    " Franks"         -0.79
    "ker"             -0.70
    " Mehran"         -0.70
    "tein"            -0.64
    "enf"             -0.63
    " Bronze"         -0.60
    " Roses"          -0.60
    " ACA"            -0.60
    "cock"            -0.59
    " Cambod"         -0.58
    Positive Logits
    "itiz"             0.84
    " accordingly"     0.83
    " dilig"           0.77
    "ゼウス"           0.77
    "ersed"            0.76
    "ilitary"          0.75
    "Þ"                0.73
    "edi"              0.72
    "graded"           0.71
    " behavi"          0.70
    Activation Density 0.018%

    No Known Activations