INDEX
    Explanations

    phrases related to animal behavior and care instruction

    Negative Logits
    Token          Logit
    " fur"         -0.16
    "kat"          -0.16
    "_cats"        -0.16
    " kat"         -0.16
    " adoption"    -0.15
    " Kat"         -0.15
    " Fur"         -0.15
    "fur"          -0.15
    "vet"          -0.15
    "710"          -0.15
    Positive Logits
    Token          Logit
    " shaping"     0.21
    " retrieves"   0.17
    " commands"    0.17
    " training"    0.17
    " Sit"         0.17
    "foundation"   0.17
    " reward"      0.17
    "associ"       0.17
    " sit"         0.16
    " heel"        0.16
    Act Density 0.025%

    No Known Activations