INDEX
    Explanations

    phrases indicating the concept of searching for or naming entities

    Negative Logits
    aille
    -0.16
    agu
    -0.15
     she
    -0.15
    aghetti
    -0.13
    anko
    -0.13
    asje
    -0.13
    they
    -0.13
    nite
    -0.13
    agine
    -0.13
    akh
    -0.13
    Positive Logits
     oneself
    0.47
     ourselves
    0.46
     herself
    0.45
     himself
    0.45
     themselves
    0.44
     yourself
    0.43
     себя
    0.41
    自己
    0.39
     myself
    0.39
     zich
    0.37
    Act Density 0.087%

    No Known Activations