INDEX
Explanations
references to the reader or audience in a conversational context
Negative Logits
orthy       -0.15
.nz         -0.15
REFERRED    -0.15
own         -0.15
owi         -0.14
atings      -0.14
Analyzer    -0.14
bach        -0.13
daughter    -0.13
OWN         -0.13
Positive Logits
guys        0.69
Guys        0.52
guy         0.42
Guy         0.37
Guy         0.36
folks       0.36
gentlemen   0.30
boys        0.27
fol         0.26
’all        0.26
Activations Density 0.126%