INDEX
Explanations
phrases related to expressing ideas or opinions in a clear and direct manner
Negative Logits
token        logit
egal         -0.89
lav          -0.70
emis         -0.68
heavily      -0.65
particularly -0.64
ittal        -0.63
extensively  -0.63
cler         -0.62
orno         -0.61
defenders    -0.61
Positive Logits
token        logit
stated       0.79
IFIED        0.73
ify          0.72
minded       0.70
ified        0.68
guessed      0.67
stating      0.65
ername       0.65
ifying       0.64
clicking     0.63
Activations Density: 0.032%
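
To give a sense of scale for the density figure, here is a minimal sketch that converts it into an expected count of activating tokens. It assumes "density" means the fraction of sampled tokens on which the feature has a nonzero activation; the page does not define the term, so treat that reading as an assumption. The function name `expected_active_tokens` and the one-million-token sample size are hypothetical, chosen only for illustration.

```python
def expected_active_tokens(density_percent: float, n_tokens: int) -> float:
    """Expected number of activating tokens in a sample of n_tokens.

    Assumes density_percent is the percentage of tokens with a nonzero
    activation (an assumed definition, not stated on the page).
    """
    return density_percent / 100.0 * n_tokens


density_percent = 0.032  # value reported above
sample_size = 1_000_000  # hypothetical sample size

# Under the assumed definition, about 320 tokens per million would activate.
print(round(expected_active_tokens(density_percent, sample_size)))
```

Under that reading, the feature is sparse: it would fire on roughly 3 tokens in every 10,000.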