Explanations
imperative verbs related to taking action or showing something
Negative Logits
kson         -0.73
ades         -0.71
ataka        -0.71
oldown       -0.65
alez         -0.63
istani       -0.62
iche         -0.60
aceutical    -0.59
etary        -0.59
tymology     -0.58
Positive Logits
ered         1.10
biz          1.09
alter        1.03
manship      0.99
case         0.92
cases        0.88
rooms        0.85
downs        0.82
runners      0.80
signs        0.78
Activations Density: 1.478%