INDEX
    Explanations

    the presence of the word "Fox" in various contexts

    Negative Logits
    ninger    -0.15
    \CMS      -0.15
    bove      -0.15
    exus      -0.15
    еÑĢж      -0.15
    antu      -0.15
    ocal      -0.15
    ignum     -0.15
    ered      -0.15
    abant     -0.15
    Positive Logits
    conn      0.20
    xy        0.19
    worthy    0.18
    boro      0.18
    croft     0.17
    ionale    0.16
    enberg    0.16
    (es       0.15
    ional     0.15
    spl       0.15
    Act Density 0.009%

    No Known Activations