Explanations
sentences ending in a colon indicating a direct quote or statement
instances of dialogue or quotes in the text
Negative Logits
aban        -0.60
iage        -0.55
idas        -0.55
irements    -0.55
irable      -0.55
roit        -0.54
ible        -0.54
vable       -0.54
ADS         -0.54
ults        -0.53
Positive Logits
"'          1.45
"…          1.34
"...        1.29
"           1.28
""          1.27
"[          1.26
"@          1.24
"(          1.20
"#          1.20
'[          1.19
Activations Density: 0.082%