Explanations
references to the name "Robert."
Negative Logits
Dinah       -0.73
Jody        -0.71
سما         -0.69
yapa        -0.68
zinha       -0.66
Unwin       -0.64
yethylene   -0.63
IDC         -0.63
trin        -0.62
вью         -0.61
Positive Logits
Robert      1.95
Robert      1.79
robert      1.66
ROBERT      1.66
robert      1.51
ROBERT      1.48
Rober       1.27
Roberts     1.20
Bob         1.09
Roberts     1.05
Activations Density: 0.072%