INDEX
Explanations
words related to actions that involve control, alteration, or influence
gerunds and present participles, indicating ongoing actions or processes
Negative Logits
omet       -0.69
spot       -0.67
eva        -0.63
sv         -0.62
flowing    -0.61
cropped    -0.58
flown      -0.57
behind     -0.57
unc        -0.57
fur        -0.56
Positive Logits
redients   1.02
HAM        0.87
tons       0.83
aukee      0.82
ADRA       0.78
utical     0.77
ulate      0.76
pole       0.76
enance     0.75
rarily     0.73
Activations Density 0.110%