Explanations
phrases indicating a strong suggestion or clue
phrases related to indicators or signs of trends or events
Negative Logits
hibited      -0.70
morph        -0.66
tz           -0.66
married      -0.60
oos          -0.60
space        -0.58
onyms        -0.58
practiced    -0.57
osh          -0.57
yright       -0.57
Positive Logits
indication    1.01
hint          0.89
hints         0.82
indications   0.81
Signs         0.78
glim          0.78
clue          0.76
signs         0.75
signal        0.75
glimpse       0.75
Activations Density 0.086%
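
As a rough illustration of how a density figure like the one above can be read, the sketch below assumes that activation density means the share of sampled tokens on which the feature has a non-zero activation. That definition, the function name `activation_density`, and the sample data are all assumptions made for illustration; the source does not state how the number was computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumed definition: density = (active tokens / total tokens) * 100.
    """
    if not activations:
        raise ValueError("activations must contain at least one value")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 100,000 tokens, of which 86 activate the feature.
sample = [0.0] * 99_914 + [1.5] * 86
density = activation_density(sample)
print(f"{density:.3f}%")  # prints "0.086%"
```

Under this reading, a density of 0.086% would mean the feature fires on roughly 86 out of every 100,000 tokens, which fits a feature tied to a narrow set of words such as "hint" or "indication".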