INDEX
Explanations
phrases indicating location or position
Negative Logits
Token            Logit
%.               -0.49
。               -0.45
.                -0.44
}.               -0.44
++.              -0.42
".               -0.42
().              -0.41
+.               -0.41
].               -0.41
).               -0.40
Positive Logits
Token            Logit
CloseOperation   0.76
ients            0.72
autorytatywna    0.68
habits           0.67
queſta           0.66
ſelben           0.65
pires            0.64
ftagPool         0.63
GenerationType   0.60
ComVisible       0.60
Activations Density 1.200%