Explanations
mentions of the name "Carol."
mentions of specific names, particularly that of "Carol."
Negative Logits
Dock       -0.83
Suarez     -0.82
Ud         -0.80
Zub        -0.76
Jinn       -0.75
dock       -0.74
Assange    -0.71
JP         -0.71
Ub         -0.69
Tanaka     -0.68
Positive Logits
Carol        3.55
Dak          1.60
Rocky        1.18
Carolina     1.16
Pam          1.08
pra          1.04
CAR          1.00
Appalach     0.99
Cycl         0.98
Appalachian  0.93
Activations Density 0.037%