INDEX
    Explanations

    words related to enhancements or improvements

    terms related to enhancements or improvements in various contexts

    Negative Logits
    Token             Logit
    "ned"             -0.72
    "zh"              -0.69
    "bia"             -0.67
    "xious"           -0.66
    "raid"            -0.66
    "zi"              -0.65
    " Brotherhood"    -0.65
    "ãĥ£"             -0.63
    "gha"             -0.63
    "ning"            -0.63
    Positive Logits
    Token             Logit
    "eatures"         1.01
    "uits"            0.96
    "ettings"         0.90
    "ometimes"        0.89
    "ktop"            0.88
    "poons"           0.86
    "éĹĺ"             0.86
    "perty"           0.84
    "hips"            0.84
    " glim"           0.81
    Activation Density: 0.053%

    No Known Activations