Explanations
references to objects or entities with anthropomorphic or playful characteristics
Negative Logits
Token     Logit
.tk       -0.14
agh       -0.14
elor      -0.14
_bio      -0.14
eger      -0.13
tiener    -0.13
alon      -0.13
Jo        -0.13
ajo       -0.13
lemn      -0.13
Positive Logits
Token     Logit
-like     0.23
analogy   0.17
ěn        0.16
lik       0.15
pline     0.15
ジア      0.15
듯        0.14
gren      0.14
like      0.14
.Interop  0.14
Activations Density: 0.196%