INDEX
Explanations
adjectives related to negative qualities or conditions
instances of the word "weak" in various contexts
Negative Logits
Token      Logit
ittee      -0.75
Ready      -0.71
Newly      -0.68
McKenna    -0.68
CENT       -0.67
ICAN       -0.67
Cycling    -0.67
OGR        -0.67
Hilton     -0.67
rection    -0.65
Positive Logits
Token      Logit
nesses     1.25
weak       1.07
weakest    0.99
ens        0.96
luster     0.93
lings      0.92
weakness   0.86
weaker     0.86
ening      0.86
undermin   0.85
Activations Density 0.006%