INDEX
    Explanations

    phrases indicating a starting point or entry into a topic or discussion

    Negative Logits
    Token        Logit
    "elper"      -0.07
    "lov"        -0.07
    "outs"       -0.07
    " dikke"     -0.07
    "aji"        -0.06
    "ARGIN"      -0.06
    " أش"        -0.06
    "ido"        -0.06
    "ÏĥÏĦε"      -0.06
    " Hats"      -0.06
    Positive Logits
    Token         Logit
    "punkt"       0.08
    " for"        0.08
    " Jacobs"     0.08
    "/end"        0.07
    "/Base"       0.07
    "/start"      0.06
    "/base"       0.06
    "¡"           0.06
    "forcement"   0.06
    " vfs"        0.06
    Act Density 0.010%

    No Known Activations
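
    To make the Act Density figure concrete, here is a minimal sketch. It assumes density means the fraction of sampled tokens on which the feature's activation is above zero; the page itself does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "Act Density" is read here as active tokens / total tokens.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 1 token out of 10,000.
acts = [0.0] * 9999 + [1.7]
density = activation_density(acts)
print(f"{density:.3%}")  # 0.010%
```

    Under that reading, a density of 0.010% corresponds to roughly one active token per 10,000 sampled.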