INDEX
Explanations
terms related to being user-friendly or suitable for a specific audience
terms related to friendliness
Negative Logits
Token       Logit
outl        -0.66
IV          -0.64
ax          -0.63
principal   -0.63
subdiv      -0.63
âĢ          -0.63
[...]       -0.62
marrow      -0.62
ded         -0.61
transc      -0.61
Positive Logits
Token       Logit
friendly    3.86
Friendly    2.18
friendly    2.10
Friend      1.99
friend      1.59
safe        1.39
riend       1.27
loving      1.27
oriented    1.25
happy       1.24
Activations Density 0.022%
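
For readers unfamiliar with the density figure, the sketch below shows one common way such a number is computed: the fraction of token positions on which the feature's activation is above zero. This is an assumption about the metric, not something the page states. The function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumed definition: density = (# active positions) / (# positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, 2 of which activate the feature.
toy_activations = [0.0] * 9998 + [1.7, 0.4]
density = activation_density(toy_activations)
print(f"{density:.3%}")  # prints "0.020%"
```

Under this reading, a density of 0.022% would mean the feature fires on roughly 2 out of every 10,000 tokens, which fits a narrow feature that responds mainly to "friendly" and its close variants.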