INDEX
Explanations
references to historical events or contexts
Negative Logits
historic      -0.22
historical    -0.22
历史          -0.21
Historical    -0.21
_history      -0.18
historic      -0.17
histó         -0.17
historian     -0.17
Historic      -0.16
istan         -0.16
Positive Logits
چه            0.31
buffs         0.23
悠            0.22
lesson        0.21
Lesson        0.21
болезни       0.20
lessons       0.18
repeating     0.18
boyunca       0.18
laden         0.18
Activations Density 0.041%