INDEX
    Explanations

    phrases indicating collective experiences and emotions

    Negative Logits
    Token            Logit
    "ạo"             -0.16
    "rian"           -0.16
    " anywhere"      -0.15
    " never"         -0.15
    "afone"          -0.14
    "άÏģÏĩ"           -0.14
    "aise"           -0.14
    "createClass"    -0.14
    " Restricted"    -0.14
    "urge"           -0.14
    Positive Logits
    Token            Logit
    "except"         0.25
    "Except"         0.24
    " Except"        0.23
    " except"        0.22
    " alike"         0.20
    " Everyone"      0.20
    "_except"        0.19
    "ayed"           0.19
    "Everyone"       0.18
    " Everybody"     0.18
    Act Density 0.138%

    No Known Activations