INDEX
    Negative Logits
     scheduled      -0.76
     periodic       -0.74
     adv            -0.74
     sway           -0.73
     adjud          -0.72
     grazing        -0.71
     unborn         -0.70
     favor          -0.69
     advertised     -0.68
     footing        -0.68
    Positive Logits
    We               1.19
    Especially       1.13
    It               1.12
    Because          1.12
    There            1.10
    They             1.10
    Obviously        1.10
    I                1.09
    Sometimes        1.08
    Whereas          1.08
    Activation Density: 0.090%

    No Known Activations