INDEX
Explanations
references to the word "golden" and related terms
Negative Logits
singular    -0.17
sel         -0.17
¸ı          -0.16
tiv         -0.16
ucci        -0.16
tics        -0.15
sWith       -0.15
tual        -0.15
ermann      -0.14
tings       -0.14
Positive Logits
rod          0.35
retrie       0.32
eye          0.30
Age          0.24
age          0.24
-haired      0.24
opportunity  0.24
berg         0.24
rule         0.24
Rule         0.23
Activations Density 0.009%