Explanations
mentions of specific names, particularly the name "Rebecca"
mentions of the name "Rebecca."
Negative Logits
teenth      -0.85
actic       -0.74
awaru       -0.73
pora        -0.71
*/(         -0.69
ician       -0.68
ategic      -0.67
llular      -0.67
ibaba       -0.67
dracon      -0.67
Positive Logits
Lopez       0.89
McKenzie    0.89
Rebecca     0.88
issance     0.86
ãĤ©         0.82
anne        0.79
Hernandez   0.78
Koen        0.76
becca       0.76
Clement     0.74
Activations Density: 0.011%