INDEX
Explanations
references to familial connections and personal lineage
Negative Logits
rather      -0.68
både        -0.66
both        -0.66
almost      -0.63
而不是      -0.59
plutôt      -0.59
istället    -0.57
rather      -0.57
almost      -0.56
somewhat    -0.56
Positive Logits
anymore     2.02
nor         1.98
anything    1.72
anything    1.42
any         1.38
有任何      1.38
anywhere    1.35
nor         1.35
siquiera    1.31
anybody     1.29
Activations Density: 3.084%