INDEX
Explanations
numerical data related to rates or measurements
Numbers with decimals
numbers followed by 'to' or 'and'
Negative Logits
</th>    -0.64
is       -0.60
</td>    -0.59
↵↵       -0.58
&        -0.57
</h4>    -0.56
=        -0.54
<        -0.54
         -0.53
</       -0.53
Positive Logits
greateſt  0.83
pleaſure  0.79
beſt      0.77
Reſ       0.77
ㅡ        0.76
Eſ        0.76
Jefus     0.76
Anſ       0.75
anſ       0.75
myſelf    0.74
Activations Density 0.060%