    Explanations

    instances of personal pronouns and contextually significant conjunctions

    Negative Logits
     queſta
    -1.09
     ویکی‌پدی
    -1.07
    <unused74>
    -0.96
     beſte
    -0.96
    <unused47>
    -0.96
    <unused8>
    -0.96
    <unused28>
    -0.96
    <unused41>
    -0.96
    <unused14>
    -0.96
    [@BOS@]
    -0.96
    Positive Logits
    en
    0.35
    st
    0.31
    ,
    0.28
     contains
    0.26
    sp
    0.26
     "
    0.25
     includes
    0.24
    mb
    0.23
      
    0.23
    o
    0.23
    Activation Density: 0.035%

    No Known Activations