INDEX
    Explanations

    code blocks and structures

    NEGATIVE LOGITS
    ɩ                 0.83
    <unused540>       0.81
     legislação       0.80
     elektromagnet    0.79
    $<                0.79
                      0.78
     conoscere        0.78
     subiect          0.77
     peraturan        0.77
    Roses             0.77
    POSITIVE LOGITS
    ↵↵↵↵↵↵            2.06
    ↵↵↵↵              2.01
    ↵↵↵↵↵             2.00
    ↵↵↵↵↵↵↵↵          1.94
    ↵↵↵↵↵↵↵           1.93
    ↵↵↵               1.83
    ↵↵↵↵↵↵↵↵↵         1.77
    ↵↵↵↵↵↵↵↵↵↵        1.59
    ↵↵↵↵↵↵↵↵↵↵↵       1.56
    ↵↵↵↵↵↵↵↵↵↵↵↵      1.44
    Activation Density: 0.323%

    No Known Activations