INDEX
    Explanations

    checking for "hello" or "hi"

    Negative Logits
    Token             Value
    " n"              1.00
    " d"              0.89
    " o"              0.88
    " "               0.84
    " attractions"    0.82
    "g"               0.78
    " v"              0.77
    " cooked"         0.75
    " g"              0.74
    " wine"           0.74
    Positive Logits
    Token             Value
    "vaient"          0.96
    " estratégica"    0.89
    " személy"        0.86
    " decía"          0.84
    " númer"          0.82
    "Thats"           0.80
    "setPreferred"    0.79
    " promot"         0.79
    "<unused331>"     0.79
    "étais"           0.78
    Activation Density: 0.002%

    No Known Activations