INDEX
    Explanations

    sequences of tabs and spacing, indicative of formatting or code structure

    code structure and punctuation

    Negative Logits
    Token               Logit
    "+#+#"              -0.82
    "featureID"         -0.75
    "<unused42>"        -0.68
    " utafitiHapana"    -0.68
    "<unused8>"         -0.68
    "[@BOS@]"           -0.68
    "<unused41>"        -0.68
    "<unused43>"        -0.68
    "<unused80>"        -0.68
    "<unused23>"        -0.68
    Positive Logits
    Token               Logit
    "..."               0.39
    " -"                0.38
    "."                 0.36
                        0.36
                        0.36
    " –"                0.35
    "o"                 0.34
    " —"                0.34
    "  "                0.33
    ","                 0.32
    Activation Density: 0.003%

    No Known Activations