INDEX
    Explanations

    references to pets in various contexts

    Negative Logits
    Token     Logit
    '));      -0.80
    )");      -0.78
    }>;       -0.78
    '])->     -0.77
    >");      -0.77
    ]]        -0.77
    "]);      -0.76
    '])){     -0.76
    ']);      -0.74
     ]        -0.74
    Positive Logits
    Token     Logit
     pet      2.37
     Pet      2.25
     pets     2.21
    Pet       2.19
     Pets     1.97
    pet       1.95
    Pets      1.90
     PET      1.80
    pets      1.73
    PET       1.69
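
The positive-logit tokens listed above differ only in leading space, capitalization, and plural form. The short sketch below is one way to check that, and is not part of the original page. The helper name `normalize` and its naive plural-stripping rule are assumptions made for illustration; the token strings and logit values are copied from the list above.

```python
# Positive-logit tokens and values copied from the list above.
# Leading spaces are part of the token and are kept on purpose.
positive_logits = {
    " pet": 2.37,
    " Pet": 2.25,
    " pets": 2.21,
    "Pet": 2.19,
    " Pets": 1.97,
    "pet": 1.95,
    "Pets": 1.90,
    " PET": 1.80,
    "pets": 1.73,
    "PET": 1.69,
}


def normalize(token: str) -> str:
    """Hypothetical helper: drop surrounding whitespace, case, and a trailing 's'."""
    word = token.strip().lower()
    # Naive plural handling, adequate only for this small token list.
    return word[:-1] if word.endswith("s") else word


# Collapse every token to its normalized form and collect the distinct results.
stems = {normalize(token) for token in positive_logits}
print(stems)
```

If the set contains a single entry, all ten tokens are surface variants of one word, which agrees with the explanation "references to pets in various contexts".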
    Activation Density 0.065%

    No Known Activations