    Explanations

    JSON key-value pairs in the text

    Negative Logits
    ']}               -0.80
    makeConstraints   -0.77
     Lin              -0.71
     In               -0.70
     K                -0.66
    </td>             -0.66
    K                 -0.65
    E                 -0.65
     Sa               -0.63
     Po               -0.62
    Positive Logits
     Monfieur         1.55
     whoſe            1.35
     Theſe            1.34
     myſelf           1.34
     raiſ             1.33
     uſed             1.31
     Jefus            1.31
     ainfi            1.26
     ſeveral          1.25
     themſelves       1.25
    Act Density 0.178%

    No Known Activations