    Negative Logits
    " painting"    -0.06
    " título"      -0.06
    "colors"       -0.06
    ".fold"        -0.06
    "Ki"           -0.06
    " whispers"    -0.06
    " Dispatch"    -0.06
    " kisses"      -0.06
    ".Scene"       -0.06
    "Lady"         -0.06
    Positive Logits
    " fish"            0.08
    "Fish"             0.07
    " به"              0.07
    " shark"           0.07
    " Connectivity"    0.06
    " Lect"            0.06
    " küt"             0.06
    " différent"       0.06
    " غذ"              0.06
    " تواند"           0.06
    Activation Density: 0.024%

    No Known Activations