INDEX
    Explanations
    Negative Logits
     appro      -0.71
     grades     -0.67
     prelim     -0.66
     repro      -0.65
     flank      -0.63
     grammar    -0.62
     Avatar     -0.62
     overlook   -0.62
     footing    -0.62
     spelling   -0.61
    Positive Logits
    We                       1.12
    It                       1.03
    Our                      1.03
    They                     1.02
    I                        1.01
    There                    1.00
    Too                      0.98
    Everything               0.97
    What                     0.97
    BuyableInstoreAndOnline  0.95
    Act Density 0.100%

    No Known Activations