INDEX
Explanations
action-related verbs
narrative shifts or changes in a storyline
Negative Logits
fter      -0.65
guide     -0.64
dor       -0.60
arta      -0.59
FIELD     -0.59
pta       -0.58
photo     -0.58
toggle    -0.58
IMAGES    -0.57
orio      -0.57
Positive Logits
raining       1.03
uphill        0.68
downhill      0.66
impossible    0.64
easier        0.61
incre         0.59
spur          0.55
worthwhile    0.54
advisable     0.54
corro         0.54
Activations Density: 1.110%