INDEX
Explanations
possessive pronouns and references to personal identity
Negative Logits
<bos>           -0.78
térm            -0.61
anglès          -0.60
tramonto        -0.59
utveckling      -0.57
alimentaire     -0.56
vård            -0.56
elettrica       -0.56
.*")]           -0.56
elettrico       -0.56
Positive Logits
betweenstory    0.89
…"              0.64
"]();           0.64
AccessorTable   0.63
észetes         0.63
".              0.63
..."            0.62
),"             0.60
...".           0.60
..."            0.59
Activations Density 0.127%
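
The density figure can be read as the share of token positions on which the feature fires. The page does not say how it is computed, so the following is only a minimal sketch under one assumption: density is the fraction of positions whose activation exceeds a threshold of zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature is
    active; the dashboard itself does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 13 of which are active.
sample = [0.0] * 9987 + [1.5] * 13
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.127% would mean the feature is active on roughly 1 in 800 token positions.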