INDEX
Explanations
references to progress or improvement
references to progress
Negative Logits
Token        Logit
ENA          -0.73
Peninsula    -0.67
gorge        -0.64
duck         -0.62
ski          -0.61
resorts      -0.61
Natural      -0.60
FACE         -0.60
cust         -0.59
stal         -0.59
Positive Logits
Token        Logit
ivism        1.40
iveness      1.12
ivity        1.09
ives         1.07
ions         1.06
ivist        1.03
progress     0.95
ively        0.91
ments        0.90
Progress     0.89
Activations Density 0.016%
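
For readers unfamiliar with the density figure, the sketch below shows one common way such a number can be computed: the fraction of tokens on which the feature's activation is above zero. This definition is an assumption, since the page itself does not say how density is measured. The function name `activation_density` and the sample data are hypothetical and chosen only to reproduce a value of 0.016%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires at all. The source page does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 tokens, with the feature firing on 16 of them.
acts = [0.0] * 100_000
for i in range(16):
    acts[i * 5_000] = 1.5  # arbitrary positive activation value

density = activation_density(acts)
print(f"{density:.3%}")  # prints "0.016%"
```

Under this reading, a density of 0.016% corresponds to roughly 16 active tokens per 100,000.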