INDEX
    Explanations

    terms related to forgetting and sponsorship

    Negative Logits
    Token            Logit
    "#+#"            -0.79
    "хьтан"          -0.75
    " enfans"        -0.71
    "Jeografia"      -0.69
    " мәкал"         -0.65
                     -0.63
    "oredCriteria"   -0.63
    " Italij"        -0.63
    " uſed"          -0.63
    "rboles"         -0.62
    Positive Logits
    Token            Logit
    " forgotten"     0.89
    " forget"        0.80
    " forgot"        0.72
    "forget"         0.68
    " forgetting"    0.65
    "forgotten"      0.64
    " Forget"        0.60
    " Forgotten"     0.60
    "Forget"         0.57
    " knowledge"     0.55
    Activation Density: 0.255%

    No Known Activations