    Explanations

    elements related to commentary and opinion sections of content

    Negative Logits
    Token      Logit
    rok        -0.16
    emouth     -0.16
    combe      -0.16
    upos       -0.15
    onta       -0.15
    üst        -0.14
    olas       -0.14
    ubat       -0.14
    vd         -0.14
    ched       -0.14
    Positive Logits
    Token          Logit
    aires          0.24
    aries          0.23
    ary            0.20
    eting          0.18
    ative          0.18
    ators          0.18
    ghan           0.18
    аÑĢÑĸ         0.16
    ariat          0.16
    /Instruction   0.16
    Activation Density: 0.031%

    No Known Activations