INDEX
Explanations
mentions or references to a person named "Jay"
references to the name "Jay."
Negative Logits
milo      -0.89
htaking   -0.83
iture     -0.79
ancial    -0.76
cffff     -0.72
osion     -0.72
ACTED     -0.71
ãĥ¯       -0.71
ilial     -0.70
rontal    -0.70
Positive Logits
hawks     1.18
hawk      1.10
Jay       0.98
walking   0.90
Jay       0.87
Dee       0.87
lon       0.86
jay       0.83
haw       0.82
bird      0.82
Activations Density: 0.015%