INDEX
Explanations
phrases introducing examples or clarifications
instances of the word "say" indicating reported speech or quotation
Negative Logits
xtap            -0.78
leeve           -0.78
aughs           -0.72
OGR             -0.71
taboola         -0.66
wald            -0.65
obal            -0.65
folios          -0.65
Compensation    -0.64
onut            -0.64
Positive Logits
lihood          0.85
ings            0.83
goodbye         0.80
hello           0.73
ership          0.71
parts           0.67
lly             0.65
ies             0.64
volent          0.64
abouts          0.62
Activations Density: 0.026%