INDEX
Explanations
Phrases indicating similarity or agreement
References to the concept of sameness or doing something similar
Negative Logits
Leilan    -0.77
hens      -0.63
Married   -0.63
frag      -0.58
uble      -0.56
GSL       -0.55
casts     -0.55
squash    -0.54
UST       -0.54
SD        -0.54
Positive Logits
thing     0.81
fate      0.79
anecd     0.77
emetery   0.77
^^^^      0.73
retch     0.71
.         0.71
.<        0.69
.</       0.69
feat      0.69
Activations Density: 0.159%