Explanations
punctuation marks used to denote speech or quotations
Negative Logits
'));
-1.13
}');
-1.08
]');
-1.03
%");
-1.02
...');
-1.02
")));
-0.96
)');
-0.96
_
-0.94
.";
-0.92
}';
-0.92
Positive Logits
“
1.99
“
1.93
"
1.62
("
1.51
‘
1.43
(“
1.43
,“
1.41
=”
1.36
="
1.34
„
1.31
Activations Density 0.519%