    NEGATIVE LOGITS
    Token            Logit
    " Cumberland"    -0.08
    " passiert"      -0.07
    "Ord"            -0.07
    "itive"          -0.07
    (not shown)      -0.07
    " aperture"      -0.07
    " lep"           -0.07
    " wrought"       -0.07
    "ART"            -0.07
    " πά"            -0.07
    POSITIVE LOGITS
    Token            Logit
    "Towards"        0.09
    " Towards"       0.09
    " نحو"           0.09
    " hacia"         0.09
    "iented"         0.08
    " towards"       0.08
    " ulang"         0.08
    (not shown)      0.08
    " zoek"          0.08
    " smack"         0.08
    Activation Density: 0.012%

    No Known Activations