Explanations
mentions of the name "Kate"
references to the name "Kate."
Negative Logits
ATIONS    -0.85
indal     -0.76
oxide     -0.76
ensical   -0.75
ENSE      -0.75
dilig     -0.74
ocre      -0.74
incinn    -0.74
tremend   -0.73
iple      -0.72
Positive Logits
Upton     1.08
lyn       0.99
McCann    0.97
Walsh     0.88
Turner    0.87
Wins      0.85
rina      0.84
Stein     0.84
Mara      0.83
Stewart   0.82
Activations Density 0.019%