Explanations
instances of the word "shift" with high relevance
references to significant changes or transformations
Negative Logits
Token           Logit
Seym            -0.67
Godd            -0.61
lov             -0.61
ÄŁ              -0.61
ortium          -0.60
Interstitial    -0.60
gom             -0.60
Smile           -0.60
REDACTED        -0.60
CRIP            -0.59
Positive Logits
Token           Logit
gears           1.27
toward          1.01
blame           1.01
towards         0.97
away            0.95
sands           0.95
shift           0.90
iness           0.85
shifts          0.80
downwards       0.79
Activations Density: 0.064%
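
The density figure above is not defined on this page. A common reading, assumed here, is the percentage of sampled token positions on which the feature has a nonzero activation. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all; the source page does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: the feature fires on 1 of 10,000 token positions.
sample = [0.0] * 9999 + [1.3]
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.064% would correspond to the feature firing on roughly 6 or 7 of every 10,000 tokens.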