INDEX
    Explanations

    references to whales and whale-watching activities

    Negative Logits
    lemen       -0.18
    unn         -0.15
    ents        -0.15
    ople        -0.14
    bach        -0.14
    oso         -0.14
    ral         -0.14
    es          -0.14
    ifiable     -0.14
    award       -0.14
    Positive Logits
    -cal        0.15
    .boost      0.15
    clamp       0.14
     Perr       0.14
    pery        0.14
    kker        0.13
    RF          0.13
    ýv          0.13
    umer        0.13
    mpi         0.13
    Activation Density: 0.005%

    No Known Activations