INDEX
    Explanations

    negative aspects or attributes associated with concepts

    descriptions of origin or type

    Negative Logits
    Token       Logit
                -0.37
     the        -0.35
     The        -0.34
    2           -0.32
     May        -0.32
     to         -0.32
     Jurí       -0.31
     Bedürfn    -0.31
     You        -0.31
    1           -0.31
    Positive Logits
    Token       Logit
     queſta     0.79
    <unused28>  0.76
    [@BOS@]     0.76
    <unused51>  0.76
    <unused14>  0.76
    <unused8>   0.76
    <unused16>  0.76
    <unused79>  0.76
    <unused3>   0.75
    <unused23>  0.75
    Act Density 0.055%

    No Known Activations