INDEX
Explanations
past tense verbs
instances of the word "tried" indicating attempts or efforts made by individuals
Negative Logits
head          -0.70
scribe        -0.70
itarian       -0.68
Production    -0.67
effect        -0.67
ros           -0.65
inas          -0.65
cedented      -0.63
Instruction   -0.63
Reserved      -0.63
Positive Logits
unsuccessfully   1.47
valiant          1.01
desperately      0.86
harder           0.85
vain             0.80
contacting       0.78
repeatedly       0.77
anke             0.75
hard             0.74
xtap             0.72
Activations Density: 0.057%