INDEX
    Explanations

    mentions of flight or flying

    Negative Logits
    illian      -0.89
    ilitary     -0.79
    ointed      -0.72
    aido        -0.71
    utory       -0.71
    uple        -0.69
    ribune      -0.69
    itution     -0.69
    orians      -0.69
    ãĥ¤         -0.69
    Positive Logits
    wheel       1.03
    leaf        0.93
    trap        0.93
    back        0.89
    knit        0.88
    cat         0.88
    weights     0.84
    hawks       0.81
    weight      0.80
    hawk        0.80
    Act Density 5.193%

    No Known Activations