INDEX
Explanations
quotations or phrases enclosed in quotation marks
instances of quotes in the text
Negative Logits
ggles     -0.87
ichick    -0.83
gart      -0.81
estate    -0.75
appers    -0.72
adh       -0.72
icz       -0.72
ipeg      -0.71
ibaba     -0.69
ntil      -0.68
Positive Logits
quote         1.09
quotes        1.06
quotation     0.87
quoting       0.86
phrases       0.86
quoted        0.82
quotations    0.81
attributed    0.80
excerpts      0.78
phrase        0.77
Activations Density 0.022%