INDEX
Explanations
occurrences of words related to substitution and replacement
Negative Logits
Token      Logit
الدراسه    -0.80
myſelf     -0.79
itſelf     -0.77
Seis       -0.75
estekak    -0.75
Theſe      -0.73
becauſe    -0.72
juſt       -0.72
reaſon     -0.71
raiſ       -0.71
Positive Logits
Token          Logit
replacement    1.73
replacing      1.66
replace        1.65
substitute     1.65
Replacement    1.56
replaces       1.55
replaced       1.52
substitution   1.47
Replace        1.45
replacement    1.43
Activations Density 0.464%