Explanations
written summaries or reports within a document
instances of the word "summary" and variations thereof
Negative Logits
Token      Logit
train      -0.73
bees       -0.72
gypt       -0.72
Wee        -0.71
duct       -0.68
warm       -0.68
charism    -0.67
hips       -0.65
Mustang    -0.64
Sea        -0.62
Positive Logits
Token      Logit
summar     0.90
summ       0.88
VIEW       0.83
summarize  0.82
summary    0.81
synopsis   0.77
thereof    0.76
strate     0.76
lations    0.71
krit       0.71
Activations Density: 0.028%