INDEX
Explanations
the word "refer"
instances of the word "refer" and its variations
Negative Logits
alach      -0.78
whiff      -0.65
underdog   -0.61
avorite    -0.61
iller      -0.60
onel       -0.58
oppable    -0.58
hov        -0.58
cape       -0.58
olean      -0.57
Positive Logits
rers       0.86
irect      0.85
entious    0.84
itatively  0.79
ãĥĥ        0.77
rences     0.76
iously     0.75
Refer      0.72
oras       0.71
minist     0.71
Activations Density 0.017%