INDEX
Explanations
possessive pronouns followed by people
Negative Logits
विविध (Hindi: "various")    0.76
abilities                   0.72
capabilities                0.72
respective                  0.71
ক্রম (Bengali: "order")      0.70
ability                     0.70
particular                  0.69
défini (French: "defined")  0.68
présence (French: "presence")  0.67
вмі                         0.66
Positive Logits
tofu       0.87
jeans      0.81
Lego       0.81
grandpa    0.78
boyfriend  0.78
puppy      0.78
beer       0.77
pizza      0.77
shampoo    0.77
tacos      0.76
Activations Density 0.037%
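
The density figure above is not defined on this page. One common reading, assumed here, is the share of token positions on which the feature activates at all. The sketch below illustrates that reading only: the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of entries strictly greater than `threshold`.

    Assumption: "density" means the fraction of token positions where
    the feature fires; the dashboard may define it differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return 100.0 * fired / len(activations)


# Hypothetical data: 10,000 token positions, feature fires on 4 of them.
sample = [0.0] * 9996 + [1.2, 0.4, 3.1, 0.9]
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a value of 0.037% would mean the feature is sparse, firing on roughly 37 of every 100,000 tokens.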