    NEGATIVE LOGITS
    Token               Value
    " eloqu"            0.43
    " mutat"            0.43
    " agam"             0.40
    " annih"            0.39
    " abone"            0.39
    (token missing)     0.39
    " bravo"            0.38
    " asem"             0.38
    " essen"            0.38
    " essencial"        0.37
    POSITIVE LOGITS
    Token               Value
    "_"                 0.45
    "<start_of_image>"  0.40
    (token missing)     0.37
    "<0xC2>"            0.36
    " F"                0.35
    (whitespace only)   0.34
    " We"               0.32
    "multicolumn"       0.32
    (whitespace only)   0.31
    "Means"             0.31
    Activation Density: 0.002%

    No Known Activations