Explanations
mentions of the name "Robert."
Negative Logits
whoſe      -0.94
Unwin      -0.89
Desai      -0.87
Dinah      -0.85
Cano       -0.81
becauſe    -0.80
ſhe        -0.80
ſhould     -0.77
يتيمه      -0.77
houſe      -0.76
Positive Logits
Robert     1.75
Robert     1.57
ROBERT     1.44
robert     1.40
Roberts    1.39
ROBERT     1.31
robert     1.30
Roberts    1.19
Rober      1.15
Bob        1.07
Activations Density 0.096%