Explanations
instances of military ranks and leadership qualities
Negative Logits
pent          -0.18
lobal         -0.18
titre         -0.16
itti          -0.14
auxiliary     -0.14
Gentle        -0.14
hired         -0.13
peu           -0.13
sail          -0.13
hire          -0.13
Positive Logits
gall          0.20
crawled       0.19
enemy         0.18
disob         0.17
disregard     0.17
crawl         0.17
bay           0.17
personally    0.16
pinned        0.16
殿            0.16
Activations Density: 0.027%