INDEX
Explanations
years or dates
occurrences of the word "in" and its variations in time-related contexts
Negative Logits
Token       Logit
kids        -0.77
texted      -0.74
Netflix     -0.72
zbollah     -0.70
paces       -0.70
bloggers    -0.70
blogs       -0.69
hover       -0.67
ross        -0.67
wcs         -0.67
Positive Logits
Token       Logit
1904        1.77
1905        1.77
1909        1.76
1935        1.74
1919        1.74
1896        1.74
1897        1.74
1895        1.73
1924        1.73
1912        1.72
Activations Density 0.244%