INDEX
Explanations
the word "Sydney" with high activations, potentially in different contexts
occurrences of the word "Sydney" and its variations
Negative Logits
caution     -0.76
resolve     -0.75
perfect     -0.68
pity        -0.66
ãĥ´ãĤ¡      -0.63
OPER        -0.63
ça          -0.62
confidence  -0.61
éĹĺ         -0.61
perture     -0.61
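Two of the negative-logit tokens above, `ãĥ´ãĤ¡` and `éĹĺ`, look like the raw byte-level form of multi-byte UTF-8 tokens rather than damaged text. The sketch below shows one way to decode them, assuming the model's vocabulary uses the GPT-2 style byte-to-character table (the page does not name the tokenizer, so this is an assumption). The helper names `bytes_to_unicode` and `decode_token` are illustrative and not taken from the dashboard.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable-character table (assumed tokenizer)."""
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            # Non-printable bytes are shifted to code points starting at 256.
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Invert the table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map each displayed character back to its byte, then decode as UTF-8."""
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĥ´ãĤ¡"))  # ヴァ
print(decode_token("éĹĺ"))     # 闘
```

Under that assumption the two entries are a katakana pair and a single kanji. Tokens that are only a fragment of a multi-byte character will decode to the replacement character, because `errors="replace"` is used.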
Positive Logits
yd          0.99
aniel       0.98
rox         0.95
roid        0.91
ynamic      0.91
essert      0.91
elta        0.91
roxy        0.90
ield        0.86
eez         0.85
Activations Density 0.004%
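Read as a fraction, 0.004% is 4e-5, or roughly one token in 25,000. The sketch below illustrates that arithmetic, assuming "activations density" means the share of token positions where the feature's activation is non-zero (the page does not define the term). The function name and the sample data are hypothetical.

```python
def activation_density(activations):
    """Share of token positions with a non-zero (positive) activation."""
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > 0)
    return fired / len(activations)


# Hypothetical data: the feature fires on 1 position out of 25,000 tokens.
sample = [0.0] * 24_999 + [3.2]
density = activation_density(sample)
print(f"{density:.3%}")  # 0.004%
```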