INDEX
Explanations
words related to attention or emphasis
references to a central theme or emphasis in a discussion
Negative Logits
idden      -0.75
Gleaming   -0.75
asca       -0.74
apons      -0.72
ania       -0.71
ston       -0.70
agically   -0.65
Tale       -0.65
ccess      -0.64
jong       -0.60
Positive Logits
starter    0.90
rite       0.88
focus      0.83
focus      0.83
focuses    0.82
focused    0.77
Focus      0.74
Goal       0.74
Focus      0.74
fulness    0.73
Activations Density 0.023%