    Explanations

    phrases related to winning and success

    Negative Logits
    Token       Logit
    "uman"      -0.15
    "etto"      -0.15
    "alte"      -0.15
    "kün"       -0.15
    "resi"      -0.14
    "odigo"     -0.14
    "elem"      -0.14
    "息"        -0.14
    "zman"      -0.14
    "ینه"       -0.14
    Positive Logits
    Token       Logit
    "nable"     0.31
    " hearts"   0.29
    "now"       0.24
    "ning"      0.22
    "’t"        0.21
    " against"  0.20
    "'t"        0.20
    " ugly"     0.19
    "kish"      0.19
    "/loose"    0.19
    Activation Density 0.044%

    No Known Activations