INDEX
Explanations
Occurrences of the "<bos>" token, which marks the beginning of a text segment
Negative Logits
himſelf           -0.86
myſelf            -0.81
WriteTagHelper    -0.80
Espèce            -0.79
ſelves            -0.78
Jefus             -0.78
themſelves        -0.78
Efq               -0.76
Theſe             -0.76
Shakspeare        -0.76
Positive Logits
<eos>     1.02
***!      0.57
()));     0.56
مرئيه     0.55
.*")]     0.54
)))       0.54
ⓧ         0.52
fine      0.51
())).     0.49
.         0.48
Activations Density 0.153%
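
A density figure like the one above is commonly read as the share of token positions on which the feature fires. The sketch below assumes that definition (an activation strictly greater than zero counts as firing), which the page itself does not state. The function name `activation_density` and the toy data are hypothetical and for illustration only.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the fraction of token positions where the
    feature is active. The source page does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical toy data: the feature fires on 2 of 1000 token positions.
toy = [0.0] * 998 + [1.7, 0.4]
print(f"{activation_density(toy):.3f}%")  # prints "0.200%"
```

Under this reading, a density of 0.153% would correspond to the feature firing on roughly 1 or 2 of every 1000 tokens sampled.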