INDEX
    Explanations

    various forms of verbs related to actions or processes

    research and ongoing work

    Negative Logits
     lendemain  -0.32
    ">—         -0.27
    MarshalTo   -0.24
     mattina    -0.21
     tutto      -0.21
    machung     -0.20
    nictwa      -0.20
     מש         -0.19
     respald    -0.19
    TagMode     -0.19
    Positive Logits
    <unused41>  0.87
    <unused14>  0.87
    <unused8>   0.87
    [@BOS@]     0.87
    <unused79>  0.87
    <unused23>  0.87
    <unused51>  0.87
    <unused28>  0.87
    <unused3>   0.87
    <unused16>  0.87
    Activation Density: 0.039%

    No Known Activations