Explanations
numeric values and mathematical symbols
Negative Logits
themſelves    -0.94
purpoſe       -0.94
itſelf        -0.93
ſelves        -0.88
ſelf          -0.82
himſelf       -0.81
uſe           -0.80
myſelf        -0.79
ſever         -0.78
ſtre          -0.77
Positive Logits
@"/            0.62
               0.59
>>()           0.53
<bos>          0.53
tampak         0.43
bağlantılar    0.42
Reg            0.42
bör            0.41
T              0.41
ValueStyle     0.41
Activations Density 2.310%