INDEX
Explanations
phrases related to replacement or substitution
Head Attr Weights
Head    Weight
0       0.09
1       0.05
2       0.26
3       0.05
4       0.03
5       0.04
6       0.12
7       0.03
8       0.03
9       0.19
10      0.03
11      0.02
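The head attribution weights above can be ranked to see which attention heads contribute most to this feature. The following is a minimal sketch, assuming the listing is a mapping from head index to attribution weight; the names `head_weights` and `top_heads` are hypothetical and not part of the dashboard.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_weights = {
    0: 0.09, 1: 0.05, 2: 0.26, 3: 0.05, 4: 0.03, 5: 0.04,
    6: 0.12, 7: 0.03, 8: 0.03, 9: 0.19, 10: 0.03, 11: 0.02,
}


def top_heads(weights, k=3):
    """Return the k (head, weight) pairs with the largest weights."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]


total = sum(head_weights.values())           # sum of the displayed weights
top = top_heads(head_weights)                # three largest heads
top_share = sum(w for _, w in top) / total   # their share of the total

print(top)
print(round(total, 2), round(top_share, 2))
```

Heads 2, 9, and 6 carry 0.57 of the 0.94 total shown. The displayed weights do not sum to exactly 1, presumably because each value is rounded to two decimals.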
Negative Logits
Token       Logit
Kafka       -3.27
chill       -3.24
amy         -3.17
clouds      -3.14
Guinness    -3.11
zoning      -3.05
Blaz        -2.96
Booth       -2.95
voy         -2.91
secrecy     -2.89
Positive Logits
Token           Logit
replacements    8.26
Replacement     7.96
replacement     7.85
Repl            7.54
repl            7.10
Replace         7.04
replace         6.84
replaced        6.70
Repl            6.54
replace         6.41
Activations Density 0.023%
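To put the density figure in concrete terms, the sketch below converts it into an expected count of activating tokens. It assumes that "Activations Density" means the percentage of tokens on which the feature has a nonzero activation; the listing does not define the term, and the function name `expected_active_tokens` is hypothetical.

```python
density_percent = 0.023  # value shown in the listing, in percent


def expected_active_tokens(density_pct, n_tokens):
    """Expected number of tokens with nonzero activation.

    Assumes density_pct is the percentage of tokens that activate the feature.
    """
    return density_pct / 100.0 * n_tokens


print(expected_active_tokens(density_percent, 1_000_000))
```

Under that assumption, the feature fires on roughly 230 tokens per million, which is consistent with a sparse feature tied to a narrow word family.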