    Explanations

    references to emotional responses and personal connections

    Negative Logits
    iper
    -0.14
    calar
    -0.13
    کار
    -0.13
    uka
    -0.13
    ullan
    -0.12
    uned
    -0.12
     [|
    -0.12
    YPE
    -0.12
    ocup
    -0.12
    pedia
    -0.12
    Positive Logits
     creation
    0.39
     create
    0.39
     created
    0.38
    create
    0.36
    created
    0.35
     creating
    0.34
    创建
    0.34
     creates
    0.33
     созд
    0.32
    -create
    0.31
    Act Density 0.008%

    No Known Activations