    Explanations

    groups and their arguments

    Negative Logits
    " usually"    1.70
    "Usually"     1.67
    "usually"     1.66
    " often"      1.64
    " meestal"    1.60
    " Usually"    1.59
    "Often"       1.57
    " semua"      1.56
    "通常"        1.55
    " souvent"    1.53
    Positive Logits
    " downright"        1.42
    " outright"         1.34
    " addirittura"      0.99
    " simply"           0.99
    " simplemente"      0.97
    " durchaus"         0.86
    " legitimately"     0.84
    " genuinely"        0.81
    " zelfs"            0.79
    " unintentional"    0.78
    Activation Density: 0.234%

    No Known Activations