INDEX
Explanations
contractions: 's, 't, 're, 've
Negative Logits
0.70
0.66
0.64
0.62
0.61
0.60
↵  0.56
0.55
0.55
0.54
Positive Logits
<unused657>  0.79
>∕  0.78
<unused1144>  0.78
<unused160>  0.75
<unused1172>  0.73
<unused674>  0.72
⃣  0.72
휋  0.71
<unused2105>  0.71
<unused600>  0.71
Activations Density 0.083%