INDEX
    Explanations

    references to freedom and its different aspects, particularly relating to religion and expression

    Negative Logits
     Kültür
    -0.15
    ãĥ³ãĤ°
    -0.14
    aska
    -0.14
     ØŃاضر
    -0.14
    λικ
    -0.14
    BOOK
    -0.14
    urban
    -0.14
    ULD
    -0.14
    cha
    -0.14
    iled
    -0.14
    Positive Logits
     fighters
    0.28
     Fighters
    0.28
    /lib
    0.26
     fighter
    0.24
    fighters
    0.23
     Fighter
    0.22
     loving
    0.22
    -loving
    0.21
    zes
    0.20
    fighter
    0.19
    Act Density 0.025%

    No Known Activations