    Explanations

    phrases related to impossibility or limitations

    Negative Logits
    Token       Logit
    "ery"       -0.79
    "ãĥī"       -0.78
    "quer"      -0.76
    "rolled"    -0.75
    "roller"    -0.73
    "mon"       -0.71
    "ura"       -0.70
    "ãĥĺ"       -0.69
    "ety"       -0.69
    "late"      -0.67
    Positive Logits
    Token              Logit
    " knowing"         0.92
    " risking"         0.92
    " sacrificing"     0.88
    " recourse"        0.84
    " compromising"    0.82
    " encountering"    0.79
    " mentioning"      0.76
    " regard"          0.75
    " seeing"          0.70
    " adequate"        0.69
    Activation Density: 0.040%

    No Known Activations