INDEX
Explanations
mentions of significant changes or transformations
occurrences of the word "shift" indicating changes or transformations
Negative Logits
amina           -0.67
gdala           -0.66
ecause          -0.66
ournals         -0.66
gom             -0.65
CRIP            -0.63
Smile           -0.62
jug             -0.62
Interstitial    -0.61
ORED            -0.60
Positive Logits
gears           1.24
toward          1.02
towards         1.00
shift           0.94
away            0.94
sands           0.93
blame           0.89
downwards       0.85
shifts          0.82
shift           0.78
Activations Density: 0.041%