INDEX
    Explanations

    references to hybrid concepts or items across various contexts

    Negative Logits
    Token       Logit
    entr        -0.17
    ģını        -0.16
    inters      -0.16
    bro         -0.16
    edir        -0.15
    brush       -0.15
    meal        -0.15
    trys        -0.15
    alet        -0.14
    ipp         -0.14
    Positive Logits
    Token       Logit
    ization     0.30
    ized        0.29
    ity         0.26
    izable      0.23
    isation     0.23
    izing       0.22
    ated        0.19
    ation       0.19
    ised        0.19
    ating       0.19
    Activation Density: 0.009%

    No Known Activations