Explanations
phrases indicating a sense of similarity or shared traits among individuals
Negative Logits
Token    Logit
शन       -0.15
ált      -0.15
dy       -0.15
like     -0.14
otty     -0.14
locate   -0.14
roe      -0.14
ÙĬج      -0.14
ogy      -0.14
roots    -0.14
Positive Logits
Token    Logit
-minded  0.42
minded   0.42
WISE     0.27
minds    0.26
Minds    0.26
able     0.25
hood     0.20
unto     0.20
inded    0.19
mind     0.19
Activations Density: 0.031%