INDEX
Explanations
references to significant historical figures and events
Negative Logits
erge            -0.19
AttributeValue  -0.15
uxtap           -0.14
Das             -0.14
abbr            -0.14
_hierarchy      -0.14
rosse           -0.14
induction       -0.14
ue              -0.14
.resume         -0.14
Positive Logits
Autor     0.19
Autor     0.18
Adv       0.18
journal   0.18
Initi     0.18
Journal   0.17
journals  0.17
ige       0.16
pension   0.15
stell     0.15
Activations Density 0.076%