Explanations
phrases related to changes or shifts in focus or direction
phrases indicating shifts or changes in direction or focus
Negative Logits
wordpress    -0.75
enery        -0.71
bath         -0.70
wic          -0.68
COMPLE       -0.68
listed       -0.65
IFE          -0.63
Brune        -0.63
imentary     -0.62
brance       -0.61
Positive Logits
favoring          0.84
raviolet          0.82
focus             0.80
reliance          0.76
focus             0.75
toward            0.75
essim             0.73
extremes          0.70
specialization    0.70
direction         0.70
Activations Density: 0.204%