    Negative Logits
    "a"           0.85
    "h"           0.71
    "d"           0.66
    "on"          0.64
    "in"          0.60
    "es"          0.57
    " on"         0.55
    "zelfde"      0.54
    "8"           0.53
    " सिस्टम"      0.52
    Positive Logits
    " violin"     0.93
    " Violin"     0.91
    "violin"      0.86
    "Viol"        0.79
    " violinist"  0.73
    " viol"       0.68
    "🎻"          0.66
    "nsk"         0.66
    "viol"        0.65
    " Dap"        0.65
    Activation Density: 0.005%

    No Known Activations