INDEX
    Negative Logits
         itſelf             -1.11
         Jefus              -1.09
        ViewFeatures        -1.02
         Majefty            -1.02
         iſt                -0.98
         houſe              -0.98
         Houſe              -0.96
         Eſ                 -0.93
         purpoſe            -0.86
         myſelf             -0.86
    Positive Logits
        ↵↵                  0.79
         (                  0.65
        "                   0.61
        <eos>               0.60
                            0.60
                            0.59
         In                 0.58
        complexContent      0.57
        '                   0.57
                            0.56
    Activation Density: 0.319%

    No Known Activations