INDEX
Explanations
phrases related to analysis or comparison of different aspects
phrases that indicate relationships or conditions involving the word "with."
Negative Logits
hood        -0.74
sat         -0.66
fair        -0.64
fiction     -0.62
deen        -0.61
sov         -0.60
she         -0.60
oult        -0.59
soft        -0.59
berto       -0.59
Positive Logits
regards     1.96
regard      1.89
respect     1.56
stood       1.48
draw        1.44
standing    1.25
impunity    1.22
drawn       1.21
holding     1.15
hindsight   0.98
Activations Density 0.244%