Explanations
No Explanations Found
Negative Logits
<eos>         -0.52
<bos>         -0.52
&             -0.47
-             -0.46
-             -0.46
or            -0.44
and           -0.43
/             -0.41
↵↵            -0.41
-/            -0.40
Positive Logits
myſelf        1.54
Efq           1.54
themſelves    1.47
itſelf        1.46
himſelf       1.46
Shakspeare    1.42
ſeveral       1.41
becauſe       1.41
Jefus         1.40
Monfieur      1.40
Activations Density 0.000%
No Known Activations
This feature has no known activations.