Explanations
the word "assembly" and words associated with taste
assembly
Negative Logits
er         -0.77
l          -0.74
new        -0.74
c          -0.73
ly         -0.72
(blank)    -0.71
regular    -0.69
g          -0.68
x          -0.67
</em>      -0.66
Positive Logits
auffi      1.27
myſelf     1.24
ſelf       1.21
ſelves     1.21
iſt        1.20
Efq        1.20
Jefus      1.20
itſelf     1.20
purpoſe    1.18
ainfi      1.16
Activations Density: 2.872%