INDEX
    Explanations

    the term "hobby" and variants indicating leisure activities

    Negative Logits
    ulpt        -0.16
    çĭIJ         -0.15
    /Graphics   -0.15
    ιβ          -0.15
    äl          -0.14
     voks       -0.14
    ázev        -0.14
    èįIJ         -0.14
    ħĮ          -0.14
    eners       -0.14

    Positive Logits
    iez         0.17
    fully       0.16
    egin        0.16
    beer        0.16
     highway    0.15
    th          0.15
    ites        0.15
     Peng       0.15
    se          0.14
    idor        0.14
    Activation Density: 0.002%

    No Known Activations