INDEX
    Explanations

    references to liberation or liberating actions

    Negative Logits
    "par"         -0.64
    " Hale"       -0.60
    "colo"        -0.59
    " listing"    -0.59
    "Sil"         -0.58
    "xus"         -0.58
    "KE"          -0.58
    "WAYS"        -0.58
    " Grimm"      -0.57
    "enegger"     -0.57
    Positive Logits
    " liberated"       1.05
    " liberate"        1.02
    " liberating"      0.86
    "raint"            0.83
    " liberation"      0.81
    "selves"           0.81
    " emancipation"    0.80
    "veland"           0.77
    "nesday"           0.75
    "piration"         0.74
    Act Density 0.012%

    No Known Activations