INDEX
Explanations
instances of specific technical tasks or procedures
Negative Logits
Token      Logit
aquatic    -0.17
maritime   -0.16
Estr       -0.15
ships      -0.14
arter      -0.14
seaw       -0.14
adiens     -0.14
naval      -0.14
tweet      -0.14
iah        -0.14
Positive Logits
Token      Logit
anchor     0.27
anch       0.25
anchor     0.23
Anchor     0.23
anchors    0.23
_anchor    0.23
anchored   0.21
-anchor    0.20
Anchor     0.20
anchors    0.20
Activations Density: 0.013%