Explanations
HTML closing tags
closing HTML tags
Negative Logits
Gallimard  -0.52
Muda       -0.51
Krone      -0.50
choque     -0.50
Mound      -0.49
ſever      -0.48
Scout      -0.48
那里       -0.47
Huerta     -0.47
ape        -0.47
Positive Logits
</       0.95
("</     0.81
'</      0.78
"</      0.76
</       0.74
.'</     0.73
"</      0.72
'</      0.69
----</   0.69
)</      0.67
Activations Density 0.068%