INDEX
    Negative Logits
     minic        -0.09
     catt         -0.09
     прис         -0.09
     кра          -0.09
    ollars        -0.08
     bowl         -0.08
     ries         -0.08
    ailte         -0.07
    ancy          -0.07
     eil          -0.07
    Positive Logits
    fabric         0.08
     sym           0.08
    Fabric         0.08
    .setdefault    0.08
    _sym           0.08
    Artifact       0.08
     artifacts     0.08
                   0.08
     Feng          0.07
     artific       0.07
    Activation Density: 0.002%

    No Known Activations