    Negative Logits
    Token         Logit
    " would"      -2.06
    "would"       -1.91
    " Would"      -1.72
    "Would"       -1.67
    " WOULD"      -1.55
    " zouden"     -1.39
    " würden"     -1.20
    " wou"        -1.14
    " zou"        -1.09
    " wouldn"     -1.04
    Positive Logits
    Token         Logit
    " be"         1.29
    " not"        0.77
    " continue"   0.76
    " get"        0.71
    " allow"      0.70
    " have"       0.69
    " happen"     0.68
    " vary"       0.66
    " lead"       0.66
    " need"       0.65
    Activation Density: 0.150%

    No Known Activations