INDEX
    Explanations

    words related to animals and pets

    Negative Logits
    Token        Logit
    éĹĺ          -0.78
    DERR         -0.76
    unda         -0.73
    artz         -0.73
    IDER         -0.72
     sclerosis   -0.70
     WARN        -0.70
    ONES         -0.69
     seeded      -0.68
     Shap        -0.68
    Positive Logits
    Token        Logit
    ertodd       1.00
     puppies     0.99
    fights       0.94
     dogs        0.94
     barking     0.92
    fight        0.90
    heter        0.89
    fighting     0.87
    riages       0.87
     pee         0.85
    Activation Density: 0.855%

    No Known Activations