INDEX
Explanations
phrases related to cages or enclosures
mentions of cages, particularly in relation to animals and their enclosures
Negative Logits
ACTED        -0.72
Bundes       -0.65
Done         -0.61
evin         -0.60
ibel         -0.60
vironment    -0.59
lender       -0.58
FM           -0.57
ITNESS       -0.57
UD           -0.56
Positive Logits
cage         1.28
cages        1.17
Cage         1.05
door         0.81
pit          0.81
washer       0.77
Rampage      0.75
eers         0.74
Frames       0.74
Handler      0.73
Activations Density: 0.007%