INDEX
    Explanations

    phrases indicating expressed thoughts or feelings

    NEGATIVE LOGITS
    Token            Logit
    <eos>            -0.57
    ↵↵↵              -0.56
    </blockquote>    -0.55
                     -0.55
    ↵↵               -0.53
    ↵↵↵↵             -0.52
    ↵↵↵↵↵            -0.50
     contra          -0.48
     -               -0.47
     again           -0.47
    POSITIVE LOGITS
    Token            Logit
    DrawerToggle     0.81
    ]<<"             0.78
     PopupWindow     0.70
     houſe           0.68
     Houſe           0.66
     voulait         0.64
    AndEndTag        0.64
     SAX             0.63
     purpoſe         0.63
    出版年           0.62
    Activation Density: 0.166%

    No Known Activations