Explanations
quoting and attributing speech
NEGATIVE LOGITS
Token       Value
différent   0.42
cones       0.42
በሽታ         0.39
руху        0.38
habitudes   0.38
hábitos     0.37
diffé       0.36
พ           0.36
peas        0.36
seper       0.36
POSITIVE LOGITS
Token   Value
said    0.55
an      0.54
says    0.54
ib      0.51
en      0.50
at      0.50
it      0.50
ot      0.50
il      0.49
us      0.49
Activations Density 0.001%