INDEX
    Explanations

    phrases related to freedom and autonomy

    Negative Logits
    "cop"             -0.17
    "cro"             -0.16
    "ael"             -0.15
    "ardım"           -0.14
    " CreateTable"    -0.14
    "cycle"           -0.14
    "imate"           -0.14
    ".ErrorCode"      -0.14
    " cyk"            -0.14
    "quer"            -0.14
    Positive Logits
    "undy"            0.17
    "esktop"          0.17
    "zed"             0.17
    "bies"            0.17
    "ë¡Ń"             0.17
    "/lib"            0.17
    "eview"           0.16
    "bie"             0.16
    " captivity"      0.16
    "osl"             0.16
    Activation Density: 0.031%

    No Known Activations