INDEX
    Explanations

    sequences of underscores in code

    Negative Logits
     queſta               -1.34
     Administrativna      -1.27
    [@BOS@]               -1.26
    <unused8>             -1.26
    <unused52>            -1.26
    <unused79>            -1.26
    <unused51>            -1.26
    <unused41>            -1.26
    <unused16>            -1.26
    <unused23>            -1.26
    Positive Logits
                          0.71
    ↵↵                    0.50
                          0.46
                          0.45
     $                    0.43
     start                0.42
    /                     0.39
    T                     0.39
     Start                0.38
    n                     0.38
    Act Density 0.001%

    No Known Activations