Explanations
references to babies and related terms
Negative Logits
CWE          -0.82
]').         -0.73
}`)          -0.73
}`           -0.71
}`).         -0.69
"]').        -0.67
esgue        -0.67
YourGuide    -0.66
prior        -0.65
]**          -0.64
Positive Logits
babies       1.85
baby         1.79
Baby         1.74
BABY         1.71
baby         1.70
Baby         1.70
BABY         1.66
babies       1.54
Babies       1.54
Babies       1.44
Activations Density 0.033%