INDEX
Explanations
mentions of cages, especially related to pets
references to cages
Negative Logits
Token       Logit
Bundes      -0.76
ACTED       -0.71
Nare        -0.68
IELD        -0.68
ibel        -0.63
olitan      -0.63
velength    -0.62
lender      -0.61
Fargo       -0.60
igate       -0.60
Positive Logits
Token       Logit
cage        0.99
cages       0.97
door        0.95
washer      0.83
pit         0.81
hold        0.80
Cage        0.79
mong        0.76
ModLoader   0.72
doors       0.72
Activations Density: 0.024%