Explanations
sets of numerical data related to weight comparisons
Negative Logits
Token       Logit
itſelf      -1.03
ſever       -0.86
themſelves  -0.86
TintMode    -0.85
Anfitrión   -0.84
raiſ        -0.83
purpoſe     -0.83
ſche        -0.82
iſt         -0.80
Diſ         -0.80
Positive Logits
Token       Logit
[blank]     0.57
</h2>       0.55
I           0.53
</em>       0.50
lui         0.48
he          0.48
Ar          0.48
://         0.46
C           0.46
↵↵          0.46
Activations Density 0.815%