INDEX
    Explanations

    instances where something is being added or increased

    references to increases or additions

    Negative Logits
    "NING"         -0.67
    "Bey"          -0.67
    "ograms"       -0.67
    "bane"         -0.66
    " Zel"         -0.66
    " WATCHED"     -0.65
    " Gram"        -0.65
    " Zimmer"      -0.63
    "baum"         -0.62
    " Nets"        -0.62

    Positive Logits
    "ictions"       1.07
    "endum"         1.01
    "itionally"     1.01
    "itional"       1.00
    " insult"       0.89
    "icted"         0.86
    "itious"        0.85
    "itions"        0.83
    "itivity"       0.82
    "ition"         0.82
    Activation Density: 0.044%

    No Known Activations