INDEX
    Explanations

    words and concepts related to capability and potential

    Negative Logits
    Token     Logit
    ing       -0.21
    ed        -0.20
    el        -0.18
    ese       -0.17
    ãĤ¥       -0.16
    olt       -0.16
    arily     -0.16
    edb       -0.16
    egal      -0.16
    ø         -0.15
    Positive Logits
    Token       Logit
    -bodied     0.21
    heid        0.18
    ummings     0.15
    lisi        0.15
    ilty        0.15
     Jar        0.15
    _OVERRIDE   0.14
    raci        0.14
    keit        0.14
    ë¡ľ         0.14
    Activation Density: 0.161%

    No Known Activations