INDEX
Explanations
phrases related to giving instructions or suggestions
phrases indicating meaningful actions and their implications
Negative Logits
tein        -0.74
Mechdragon  -0.67
onom        -0.62
ovember     -0.59
abase       -0.58
OTOS        -0.58
ensor       -0.58
sonian      -0.57
aceae       -0.55
coni        -0.55
Positive Logits
akings      0.54
Ģ           0.51
ario        0.49
cryst       0.49
ese         0.48
ra          0.47
eal         0.47
conclud     0.46
«           0.46
eed         0.45
Activations Density: 0.254%