Explanations
words related to growth or increase
terms related to growth or increased intensity
Negative Logits
ãĥīãĥ©    -0.76
phis    -0.74
Rouge    -0.68
stice    -0.67
orneys    -0.66
hra    -0.66
odan    -0.64
oute    -0.64
ioned    -0.64
Mub    -0.63
Positive Logits
exponentially    1.05
pains    0.94
promot    0.89
explos    0.81
accustomed    0.80
enormously    0.78
stronger    0.76
tremendously    0.75
immensely    0.75
closer    0.72
Activations Density: 0.046%