Explanations
references to the word "our" and its variations
Negative Logits
Token      Logit
yf         -0.56
Dijk       -0.55
jine       -0.54
arynx      -0.54
ency       -0.53
yym        -0.52
Dimen      -0.51
childs     -0.51
ibb        -0.51
Fayette    -0.51
Positive Logits
Token      Logit
our        0.89
Our        0.86
Our        0.84
våre       0.83
våra       0.83
OUR        0.80
meille     0.78
nuestra    0.74
nuestros   0.72
we         0.71
Activations Density: 0.085%