INDEX
Explanations
determiners followed by nouns
Negative Logits
tricky         0.90
confus         0.82
ρεία           0.80
confused       0.79
Convenient     0.78
relevante      0.77
interessant    0.77
convenient     0.76
smelly         0.75
confusing      0.75
Positive Logits
words          0.78
每一           0.71
its            0.69
imparted       0.68
每一个         0.68
workmanship    0.67
remarks        0.66
一个人         0.66
argumentos     0.65
رد             0.65
Activations Density: 0.398%