INDEX
Explanations
phrases related to change or transformation
Negative Logits
enegger       -0.74
merce         -0.69
dale          -0.68
Annotations   -0.68
gress         -0.68
confir        -0.67
cffff         -0.66
MER           -0.65
separates     -0.63
erness        -0.63
Positive Logits
tide          1.30
tables        1.22
tides         1.05
knob          1.02
Tables        1.01
corner        1.00
screws        0.95
pages         0.95
page          0.95
spotlight     0.89
Activations Density: 0.063%