    NEGATIVE LOGITS
    Token       Logit
    "}},        -0.39
    "))\n       -0.39
     oprot      -0.38
    tabular     -0.37
    ")));\n     -0.36
    zeigen      -0.36
    "])\n       -0.36
    ]<<"        -0.36
    OGND        -0.35
    iastical    -0.35
    POSITIVE LOGITS
    Token       Logit
     ?          1.55
    ?           1.12
     ?$         1.06
    !?          1.05
     ?"         1.05
     ?,         1.04
     ?'         1.04
     ?...       1.04
     ?.         1.04
     ?-         1.02
    Activation Density: 0.009%

    No Known Activations