INDEX
Explanations
words related to scientific and technical terms like "baseline," "experiment," and "equilibrium."
terms associated with baseline or equilibrium states
Negative Logits
phan     -0.76
acles    -0.76
Else     -0.71
atom     -0.69
Office   -0.69
uild     -0.69
Anth     -0.68
paces    -0.67
office   -0.67
odes     -0.67
Positive Logits
baseline    0.98
phrine      0.83
sidx        0.77
jumper      0.74
outline     0.71
smoot       0.70
dummy       0.70
shif        0.67
heartbeat   0.64
level       0.63
Activations Density 0.009%