INDEX
    Explanations

    references to the word "congratulations" and its variations

    Negative Logits
    "icip"        -0.15
    "ogl"         -0.15
    "iera"        -0.15
    "icias"       -0.15
    "ث"           -0.14
    " Cab"        -0.14
    "etz"         -0.14
    "icrous"      -0.14
    "AsString"    -0.14
    "θη"          -0.14
    Positive Logits
    "estion"      0.26
    "regation"    0.21
    " Cong"       0.19
    "rats"        0.19
    "Cong"        0.19
    "ional"       0.18
    "ault"        0.16
    "hton"        0.16
    "ado"         0.15
    "uis"         0.15
    Act Density 0.015%

    No Known Activations