INDEX
Explanations
phrases indicating relationships of parts to a whole
Negative Logits
XK          -0.88
Majefty     -0.73
quæ         -0.73
Efq         -0.73
Laplacian   -0.73
pleaſure    -0.72
purpoſe     -0.71
Labrador    -0.68
fevere      -0.65
Centurion   -0.65
Positive Logits
OutOf       1.18
outta       1.17
INTO        0.87
Into        0.81
Dooley      0.81
the         0.77
########.   0.77
into        0.77
]];         0.76
vanuit      0.73
Activations Density 0.043%