INDEX
    Explanations

    function definitions and results

    Negative Logits
     λα             1.41
     Μα             1.38
     swoją          1.35
     μαγγ           1.34
     birçok         1.34
    த்ரே            1.34
     zahlreiche     1.33
     regiões        1.33
    <unused409>     1.32
    யோ              1.31
    Positive Logits
    <eos>               1.78
    ↵↵↵↵                1.05
    ↵↵↵                 0.98
    .</                 0.95
    ↵↵↵↵↵               0.95
                        0.92
    <start_of_image>    0.91
    ↵↵                  0.90
    ↵↵↵↵↵↵              0.88
    。<                 0.86
    Activation Density: 0.024%

    No Known Activations