    Explanations

    only to a negative outcome

    Negative Logits
        OK          0.48
        includes    0.46
        overrides   0.44
        assorted    0.44
        correlates  0.42
        Interpre    0.42
        passages    0.41
        包括        0.41
        asum        0.41
        iseur       0.40
    Positive Logits
        να          0.71
        kemudian    0.68
        然后        0.60
        后来        0.58
        затем       0.57
        ثم          0.56
        потім       0.56
        finally     0.55
        akhirnya    0.54
        後來        0.53
    Activation Density: 0.010%

    No Known Activations