Explanations
words related to historical events or topics
references to history as a subject or theme
Negative Logits
enger       -0.73
igans       -0.67
leased      -0.67
Introduced  -0.66
achment     -0.64
¾           -0.62
weeney      -0.62
unsigned    -0.62
autions     -0.61
arling      -0.60
Positive Logits
buffs       1.06
textbooks   1.01
lesson      0.92
orians      0.91
books       0.90
repeats     0.83
making      0.78
orian       0.76
repeating   0.76
museums     0.75
Activations Density: 0.045%