Explanations
phrases emphasizing honesty and openness in communication
Negative Logits
brinco       -0.45
WaitGroup    -0.43
NameInMap    -0.42
不起         -0.42
Investor     -0.41
clusão       -0.40
verz         -0.40
CIT          -0.40
investor     -0.39
quer         -0.39
Positive Logits
honesty        0.55
vérit          0.54
honest         0.54
truths         0.53
truth          0.51
хьтан          0.51
truthful       0.49
TRUTH          0.49
oredCriteria   0.49
verdades       0.48
Activations Density 0.306%