    Explanations

    references to weight and burdens

    Negative Logits
    Token           Logit
    "باب"           -0.14
    "ording"        -0.14
    "è¦"            -0.14
    " Album"        -0.14
    " à¤ħà¤Ń"       -0.13
    " nucleus"      -0.13
    "jer"           -0.13
    "tel"           -0.13
    "aron"          -0.13
    "itten"         -0.13
    Positive Logits
    Token                 Logit
    " burden"             0.19
    " weight"             0.17
    " burdens"            0.17
    " load"               0.17
    " responsibilities"   0.16
    "weight"              0.16
    " responsibility"     0.16
    " Load"               0.16
    "weights"             0.16
    "heimer"              0.15
    Activation Density: 0.224%

    No Known Activations