INDEX
    Explanations

    phrases that begin with an apostrophe

    Negative Logits
    <bos>          -1.48
    (blank)        -0.82
    <?             -0.79
    (blank)        -0.68
    <eos>          -0.67
    (blank)        -0.67
     do            -0.67
     don           -0.66
     comme         -0.66
    ണ്ട             -0.66
    Positive Logits
     maneu         2.66
     increa        2.48
     emphat        2.46
     accla         2.46
     affor         2.45
     reluct        2.43
     shenan        2.31
     practition    2.29
     disagre       2.28
     inev          2.26
    Act Density 0.044%

    No Known Activations