Explanations
instances of quotation marks or dialogue
Negative Logits
<bos>           -1.01
Jeografia       -0.98
хьтан           -0.95
Portale         -0.94
betweenstory    -0.91
Datuak          -0.90
Catawiki        -0.88
enterOuterAlt   -0.88
########.       -0.87
SharedCtor      -0.85
Positive Logits
[               0.82
[               0.69
wasn            0.67
<eos>           0.63
isn             0.60
don             0.60
aren            0.59
mathrm          0.57
didn            0.57
wouldn          0.55
Activations Density 0.013%