INDEX
    Explanations
    Negative Logits
    Token           Logit
    " occur"        -1.89
    " occurred"     -1.79
    " occurs"       -1.78
    " occurring"    -1.62
    "occurs"        -1.60
    " occured"      -1.47
    " Occur"        -1.44
    "occur"         -1.41
    "Occur"         -1.39
    "occurring"     -1.24
    Positive Logits
    Token           Logit
    " in"           0.90
    " at"           0.67
    " with"         0.65
    " after"        0.64
    ","             0.64
    " from"         0.62
    " as"           0.61
    "."             0.61
    " on"           0.59
    " under"        0.59
    Activation Density: 0.035%

    No Known Activations