INDEX
Explanations
references to the name "Carol" and its variations in different contexts
Negative Logits
Token   Logit
oise    -0.17
ovice   -0.16
ayet    -0.16
plier   -0.15
strup   -0.15
oxy     -0.15
etary   -0.15
andon   -0.15
одо     -0.14
ched    -0.14
Positive Logits
Token   Logit
Ann     0.19
thers   0.18
Ann     0.16
lee     0.15
ynn     0.15
les     0.15
Sue     0.15
lo      0.15
led     0.15
anne    0.15
Activations Density: 0.005%