INDEX
    Explanations

    phrases that indicate instructions or actions

    Negative Logits
    "sky"          -0.15
    "nt"           -0.14
    "mente"        -0.14
    "nection"      -0.13
    " Question"    -0.13
    "where"        -0.13
    " Choice"      -0.13
    "_FOUND"       -0.13
    "ane"          -0.13
    "ize"          -0.13
    Positive Logits
    "ptal"           0.21
    "ekim"           0.18
    "adil"           0.18
    " learn"         0.17
    "ombs"           0.17
    "xico"           0.17
    "accom"          0.16
    " further"       0.16
    "Äł"             0.16
    " complement"    0.15
    Activation Density: 0.042%

    No Known Activations