Explanations
No Explanations Found
Negative Logits
.),        0.43
:",        0.42
),         0.41
<(         0.40
context    0.39
:',        0.39
_),        0.38
context    0.37
>',        0.37
)).        0.37
Positive Logits
Grandpa    0.76
Mrs        0.70
Granny     0.67
Grandma    0.65
Mama       0.65
Mama       0.61
Aunt       0.60
grandpa    0.58
Mrs        0.56
Sally      0.55
Activations Density 0.008%