    Explanations

    updates or changes in information

    Negative Logits
     nurture        -0.71
    sbm             -0.67
     everyday       -0.66
     loneliness     -0.65
    oeuv            -0.65
    ²¾              -0.64
    Imagine         -0.63
    minecraft       -0.63
     indifferent    -0.63
    ¥µ              -0.62
    Positive Logits
     corrected      1.24
     clarification  1.15
     clarified      1.14
    UPDATE          1.03
     corrections    1.02
     revised        1.01
     correction     1.01
     typo           0.99
    PDATED          0.99
     Correction     0.98
    Activation Density: 0.402%

    No Known Activations