INDEX
    Explanations

    references to contradiction and complexity in human relationships

    Negative Logits
     unfortunately  -0.19
    Unfortunately   -0.19
    Sadly           -0.16
     sadly          -0.16
     Unfortunately  -0.16
    çIJ´            -0.14
     Geile          -0.14
     Dabei          -0.14
     Sadly          -0.14
    uze             -0.14
    Positive Logits
     Still          0.68
     still          0.64
    Still           0.62
    Nevertheless    0.61
     Nonetheless    0.59
     nevertheless   0.58
     nonetheless    0.58
     Nevertheless   0.58
     STILL          0.55
    still           0.52
    Act Density 0.348%

    No Known Activations