INDEX
Explanations
instances of the word "far" and its derivatives
Negative Logits
Token       Logit
UAL         -0.16
gua         -0.15
chai        -0.15
ius         -0.15
ems         -0.15
dings       -0.15
ual         -0.14
making      -0.14
ernaut      -0.14
ieran       -0.14
Positive Logits
Token       Logit
thest       0.28
-reaching   0.23
mland       0.21
away        0.18
mlink       0.18
enough      0.17
rell        0.17
oud         0.17
thers       0.16
rier        0.16
Activations Density: 0.021%