INDEX
Explanations
mentions of the name "Sydney"
references to the term "Sydney."
Negative Logits
ça         -0.67
deaf       -0.66
voy        -0.64
Xi         -0.63
pity       -0.60
resolve    -0.60
TAIN       -0.60
FIRE       -0.60
caution    -0.59
TRY        -0.58
Positive Logits
roxy        1.09
roid        1.08
aniel       1.06
rox         1.03
essert      0.98
ynamic      0.93
imentary    0.93
ere         0.93
ritch       0.93
ynam        0.90
Activations Density 0.010%
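
To put the density figure in concrete terms, here is a minimal sketch. It assumes that "activation density" means the fraction of token positions on which the feature's activation is above zero. The page does not state that definition. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and only illustrate the arithmetic: a density of 0.010% corresponds to roughly one active token in 10,000.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: density = (active positions) / (all positions).
    This is not confirmed by the source page.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: one active token among 10,000 positions.
sample = [0.0] * 9999 + [3.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this assumption, a feature this sparse would fire on only a handful of tokens in a typical document, which fits a feature tied to a single proper noun.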