Explanations
references to cultural significance and communal aspects in a narrative
Negative Logits
ca           -0.16
seud         -0.16
ubl          -0.16
lots         -0.15
forgotten    -0.15
Zot          -0.14
eki          -0.14
forgot       -0.14
less         -0.14
ille         -0.14
Positive Logits
history      0.44
history      0.36
History      0.36
History      0.33
.history     0.32
HISTORY      0.32
-history     0.32
_history     0.30
histories    0.29
Geschichte   0.29
Activations Density 0.003%