INDEX
    Explanations

    mentions of tigers and related terms

    Negative Logits
    abilit    -0.18
    ategory   -0.15
    gens      -0.15
    tridge    -0.15
    agma      -0.14
    letcher   -0.14
    ataka     -0.14
    çĭIJ      -0.14
    orative   -0.14
    YTE       -0.14
    Positive Logits
     Woods    0.29
     cub      0.26
     Cub      0.22
     Tiger    0.22
     Claw     0.21
     Cubs     0.21
     claw     0.20
     woods    0.19
    hawk      0.19
     Paw      0.18
    Act Density 0.009%

    No Known Activations