INDEX
    Explanations

    negative or skeptical phrases about past actions and decisions

    Negative Logits
    Token             Logit
    " queſta"         -0.77
    " témoig"         -0.75
    "rungsseite"      -0.74
    " ſind"           -0.73
    " Efq"            -0.70
    " dieſe"          -0.68
    " Anſ"            -0.68
    " Weiſe"          -0.67
    "MemoryWarning"   -0.67
    "iſchen"          -0.66
    Positive Logits
    Token             Logit
    (blank)           0.36
    " it"             0.34
    " T"              0.34
    " phase"          0.30
    " ends"           0.30
    " seems"          0.29
    " finit"          0.29
    " pie"            0.28
    " appears"        0.28
    " affect"         0.28
    Act Density 0.233%

    No Known Activations