INDEX
    Explanations

    names or titles with 'von' in them

    mentions of the name "Von."

    Negative Logits
    "gallery"     -0.82
    "uyomi"       -0.82
    "taboola"     -0.82
    "lighting"    -0.73
    "dress"       -0.72
    "inals"       -0.68
    "Ĥª"          -0.68
    "colour"      -0.67
    "MpServer"    -0.66
    "rew"         -0.66

    Positive Logits
    " Braun"      0.86
    " Von"        0.84
    " von"        0.76
    " Frey"       0.75
    " der"        0.74
    " Karma"      0.74
    "forcer"      0.74
    "hof"         0.73
    " Schwarz"    0.72
    "env"         0.72
    Activation Density: 0.011%

    No Known Activations