Explanations
phrases related to specific events or actions with a focus on present or recent activity
phrases that indicate important events or occurrences
Negative Logits
@         -0.84
Recomm    -0.73
estate    -0.71
hof       -0.69
},{"      -0.69
Joined    -0.68
bol       -0.68
norm      -0.66
KY        -0.66
]}        -0.66
Positive Logits
hindsight     1.03
exceptions    0.88
caveat        0.80
caveats       0.79
exception     0.78
emph          0.75
twist         0.74
gone          0.69
aster         0.69
bang          0.69
Activations Density: 0.456%