INDEX
Explanations
references to individuals and their current or past roles in context
Negative Logits
ãĥ³ãĥIJ      -0.15
appoint     -0.14
tical       -0.14
´Ī          -0.14
vla         -0.14
hierarchy   -0.13
fern        -0.13
овÑĸд       -0.13
avia        -0.13
Turnbull    -0.13
Positive Logits
serve       0.21
served      0.20
serves      0.20
chet        0.17
Serve       0.17
serving     0.17
serve       0.16
bart        0.16
Serve       0.15
suma        0.15
Activations Density 0.277%