INDEX
Explanations
questions starting with the word "which"
typical sentence starters or phrases that introduce questions or topics
Negative Logits
Token      Logit
ARC        -0.73
greg       -0.72
affle      -0.69
ancock     -0.68
ourning    -0.67
Rated      -0.67
usk        -0.65
perty      -0.65
ancial     -0.65
Repl       -0.64
Positive Logits
Token        Logit
brings       1.52
begs         1.51
leads        1.26
reminds      1.15
means        1.15
raises       1.12
leaves       1.09
explains     1.05
translates   1.04
makes        1.03
Activations Density 0.049%