    Explanations

    wen/Gwen/Owen

    Negative Logits
     Hungary      -0.07
     Calories     -0.07
                  -0.06
    Tail          -0.06
    '#            -0.06
     kitty        -0.06
     Assume       -0.06
    clinical      -0.06
     Dense        -0.06
    сты           -0.06
    Positive Logits
     Gwen          0.08
    wyn            0.07
    win            0.07
    wen            0.07
     Wen           0.07
     Owen          0.07
     listens       0.07
    Invocation     0.07
    naire          0.06
    quent          0.06
    Activation Density: 0.003%

    No Known Activations