Explanations
closing braces and the end of code blocks in programming syntax
Negative Logits
itſelf        -0.84
myſelf        -0.82
himſelf       -0.80
pleaſure      -0.78
correctes     -0.76
houſe         -0.75
unſ           -0.75
themſelves    -0.74
Jefus         -0.74
raiſ          -0.73
Positive Logits
featureID         0.82
else              0.80
else              0.74
elif              0.59
NameInMap         0.57
<eos>             0.57
Erfolge           0.53
Diweddarwch       0.52
matchCondition    0.52
็จ                0.50
Activations Density 0.075%