INDEX
    Explanations

    catastrophic forgetting

    NEGATIVE LOGITS
    ")|^{"  0.36
    [token not shown]  0.35
    "GPS"  0.35
    " রাষ্ট্রদূত"  0.34
    "តុ"  0.33
    "PERATURE"  0.33
    "âl"  0.33
    " नाबाद"  0.33
    [token not shown]  0.33
    "र्चे"  0.32
    POSITIVE LOGITS
    " forget"  4.38
    " forgetting"  3.97
    " forgot"  3.95
    "forget"  3.89
    " forgets"  3.88
    " Forget"  3.83
    "Forget"  3.78
    " забы"  3.69
    " forgotten"  3.64
    "忘记"  3.58
    Activation Density: 0.044%

    No Known Activations