INDEX
    Explanations

    indicating purpose or benefit

    Negative Logits
        Token              Value
        "i"                0.58
        "ל"                0.46
        "י"                0.45
        "OF"               0.42
        (token missing)    0.42
        (token missing)    0.42
        (token missing)    0.40
        (token missing)    0.40
        "are"              0.38
        "の通販"            0.38
    Positive Logits
        Token              Value
        (token missing)    0.52
        (token missing)    0.50
        " 🌱"              0.49
        "۩"                0.49
        "😧"               0.48
        " to"              0.47
        " číslo"           0.47
        (token missing)    0.46
        " de"              0.46
        "ong"              0.46
    Activation Density: 0.047%

    No Known Activations