INDEX
    Explanations

    specific nouns and verbs associated with cultural or artistic references

    Negative Logits
    " unable"     -0.15
    "scal"        -0.14
    "abilities"   -0.14
    "ilities"     -0.14
    " inh"        -0.14
    "tiv"         -0.14
    "fed"         -0.13
    "ÑĤим"        -0.13
    "hos"         -0.13
    "ereco"       -0.13
    Positive Logits
    "GENCY"       0.15
    "AGR"         0.14
    "quirrel"     0.14
    "ìĹĦ"         0.13
    "riday"       0.13
    " Summers"    0.13
    "ku"          0.13
    "draft"       0.13
    "endale"      0.13
    "ма"          0.13
    Act Density 0.479%

    No Known Activations