Explanations
phrases indicating the act of referring or directing someone's attention to something
instances of the word "refer" and its variations
Negative Logits
alach       -0.78
ilings      -0.64
underdog    -0.64
whiff       -0.63
heart       -0.63
olean       -0.60
sshd        -0.57
cho         -0.57
urity       -0.56
illa        -0.56
Positive Logits
irect       0.81
entious     0.81
rers        0.78
itatively   0.77
Refer       0.72
oras        0.71
refer       0.71
ring        0.69
iously      0.69
ãĥĥ         0.67
Activations Density 0.025%