INDEX
    Explanations

    question words and modals

    Negative Logits
    ÂĢÂĢ          -0.14
    ******č\n     -0.14
    ¶Į            -0.14
    .Formatter    -0.12
    įng           -0.12
    .Currency     -0.11
    ĥ½            -0.10
    %č\n          -0.10
    ıa            -0.10
    ¨ë¶Ģ          -0.10
    Positive Logits
     however      0.10
     [            0.08
     therefore    0.08
     @            0.08
     roughly      0.08
     then         0.08
     ~            0.08
     (whitespace) 0.08
     *            0.08
     enim         0.08
    Act Density 1.036%

    No Known Activations