INDEX
    Explanations

    mentions of the concept of freedom or references to the word "free."

    Negative Logits
    "older"      -0.81
    " sidx"      -0.81
    "ulous"      -0.77
    "amel"       -0.72
    " therap"    -0.69
    "URRENT"     -0.67
    "IPS"        -0.65
    " Takeru"    -0.64
    "ENTS"       -0.64
    "ENTION"     -0.61
    Positive Logits
    "bies"       1.37
    "bie"        1.11
    "zing"       1.08
    "zers"       1.08
    "boot"       0.98
    "zes"        0.97
    " roam"      0.96
    "edom"       0.94
    "zer"        0.83
    "ze"         0.83
    Activation Density: 0.056%

    No Known Activations