INDEX
    Explanations

    references to the concept of freedom

    Negative Logits
    imum       -0.15
    anes       -0.15
    ilet       -0.15
    ikal       -0.15
    ика        -0.14
    rim        -0.14
     Meteor    -0.14
    legen      -0.14
    ocracy     -0.14
    ilm        -0.14
    Positive Logits
     bott      0.15
    enton      0.15
    iddi       0.15
    NT         0.14
    adients    0.14
    æIJº       0.14
    .tip       0.14
     campus    0.14
    ocity      0.14
    verity     0.14
    Activation Density: 0.034%

    No Known Activations