Explanations
physical actions and interactions in a story
Negative Logits
onite       -1.00
uristic     -0.91
Pros        -0.89
IJ          -0.87
hallmark    -0.86
Impl        -0.85
alike       -0.81
precursor   -0.81
pson        -0.79
transpl     -0.79
Positive Logits
safely      1.02
Sabha       1.01
stage       0.99
robe        0.95
bed         0.93
plane       0.93
peacefully  0.91
tripod      0.90
stairs      0.89
unst        0.89
Activations Density 1.461%