Explanations
words related to negative actions or consequences
characteristics of gerunds and present participles in various contexts
Negative Logits
Token       Logit
fired       -0.70
talk        -0.63
cropped     -0.62
ドラ        -0.62
flown       -0.61
notice      -0.59
fur         -0.59
lake        -0.58
deal        -0.57
vault       -0.57
Positive Logits
Token       Logit
redients    1.06
ating       0.96
atable      0.90
ATING       0.89
tons        0.89
ulate       0.85
issance     0.77
ivalent     0.77
orically    0.76
Sorce       0.76
Activations Density: 0.079%
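
The density figure above is usually read as the share of token positions on which the feature fires at all. The page does not say how it was computed, so the following is only a minimal sketch under that assumption. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and do not come from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of token positions where the activation exceeds threshold.

    Assumption: "density" means the fraction of positions with a nonzero
    (here: above-threshold) activation, expressed as a percentage.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical toy data: 10,000 token positions, 8 of which have a nonzero activation.
toy = [0.0] * 9992 + [1.3, 0.2, 4.1, 0.7, 2.5, 0.9, 3.3, 0.1]
print(f"{activation_density(toy):.3f}%")  # prints 0.080%
```

Under this reading, a density of 0.079% would correspond to a feature that is active on roughly 8 out of every 10,000 token positions.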