    Negative Logits
    Token            Value
    𓏧                0.78
     şöyle           0.70
    ශ්‍ය               0.70
    \%),             0.69
    (no token text)  0.67
    LLCATS           0.67
    (no token text)  0.66
     ಮಾತ             0.66
    DBGPRINT         0.66
    (no token text)  0.66
    Positive Logits
    Token              Value
    <eos>              2.93
    (no token text)    2.06
    <start_of_image>   1.80
    </blockquote>      1.67
    ↵↵↵↵↵              1.66
    ↵↵↵↵               1.60
    ↵↵↵                1.60
    ↵↵↵↵↵↵↵↵↵          1.52
    ↵↵↵↵↵↵             1.50
    ↵↵↵↵↵↵↵            1.50
    Activation Density: 1.301%

    No Known Activations