INDEX
Explanations
references to historical events or concepts
Negative Logits
History      -0.33
histories    -0.33
history      -0.33
History      -0.32
_history     -0.30
history      -0.30
历史         -0.30
história     -0.28
HISTORY      -0.28
historia     -0.28
Positive Logits
悠             0.24
/current      0.23
accuracy      0.21
-fiction      0.21
context       0.20
Marker        0.19
/arch         0.19
figures       0.19
-marker       0.19
significance  0.18
Activations Density 0.021%