INDEX
Explanations
phrases emphasizing the word "such."
Negative Logits
ensch       -0.20
acin        -0.15
such        -0.15
both        -0.15
gram        -0.14
isches      -0.14
aign        -0.14
BOTH        -0.14
whatever    -0.14
efeller     -0.14
Positive Logits
like        0.29
thing       0.27
things      0.25
coisa       0.21
itra        0.20
образом     0.19
ness        0.19
thing       0.19
-like       0.19
curity      0.18
Activations Density 0.055%