INDEX
Explanations
years or dates mentioned as part of historical events
occurrences of the word "in" and its associated temporal context
Negative Logits
complex     -0.81
texted      -0.74
-->         -0.73
areth       -0.71
bloggers    -0.71
ï¸ı         -0.69
ror         -0.68
GoPro       -0.67
abis        -0.67
voic        -0.67
Positive Logits
1925        1.95
1927        1.92
1926        1.91
1904        1.91
1921        1.91
1924        1.91
1923        1.89
1919        1.89
1903        1.87
1895        1.87
Activations Density: 0.247%
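
A minimal sketch of how a density figure like the one above could be computed, assuming it means the fraction of token positions on which the feature's activation is above zero. That definition is an assumption, not something stated on this page. The function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means (active positions) / (all positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: the feature fires on 247 of 100,000 token positions.
toy_activations = [1.0] * 247 + [0.0] * (100_000 - 247)

density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under this reading, a density of 0.247% means the feature is sparse: it fires on roughly one token in every four hundred.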