INDEX
    Explanations

    conditional statements or phrases indicating hypothetical scenarios

    Negative Logits
    "iban"     -0.16
    "ieur"     -0.16
    "umd"      -0.15
    "ороз"     -0.15
    "окол"     -0.15
    "undy"     -0.14
    "少年"     -0.14
    ".nasa"    -0.14
    "ataka"    -0.14
    "ainer"    -0.13
    Positive Logits
    " anything"    0.35
    " anyone"      0.33
    " anybody"     0.31
    " ever"        0.30
    " memory"      0.25
    "anything"     0.25
    " Anyone"      0.24
    " nothing"     0.24
    " Anything"    0.24
    "Anyone"       0.24
    Act Density 0.072%

    No Known Activations