INDEX
    Explanations
    No Explanations Found
    Negative Logits
    .”     -1.00
    ,”     -0.96
    ”.     -0.93
    ?”     -0.92
    =”     -0.91
           -0.91
    ”,     -0.89
    )=$    -0.88
    !”     -0.86
    ’.     -0.85
    Positive Logits
    <bos>         7.00
     intersper    2.77
     encomp       2.75
     fuf          2.73
     guarante     2.71
     maneu        2.70
     fta          2.69
     emphat       2.67
     increa       2.61
     ftu          2.58
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.