Explanations
expressions referring to time periods in the 1990s
references to decades, particularly the 1990s and surrounding years
Negative Logits
Token     Logit
disemb    -0.68
recip     -0.65
spac      -0.63
aukee     -0.60
Europa    -0.59
trib      -0.58
bom       -0.56
bearer    -0.56
Devi      -0.56
Ascend    -0.55
Positive Logits
Token     Logit
s         1.26
ties      1.05
ixties    1.04
âĸĪâĸĪ    0.94
ugh       0.85
eties     0.80
sie       0.79
western   0.79
century   0.79
twenties  0.78
Activations Density 0.034%
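
The positive-logit token "âĸĪâĸĪ" looks like the raw vocabulary form used by byte-level BPE tokenizers rather than readable text. The sketch below assumes the model uses the GPT-2 style byte-to-unicode table, which this page does not state; the function names are illustrative and not part of any dashboard API. Under that assumption, it maps such a token back to the characters it encodes.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable-character table (assumed tokenizer)."""
    # Bytes that are already printable map to themselves.
    bs = list(range(0x21, 0x7F)) + list(range(0xA1, 0xAD)) + list(range(0xAE, 0x100))
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


def decode_byte_level_token(token):
    """Convert a byte-level vocabulary string back into ordinary text."""
    char_to_byte = {ch: b for b, ch in bytes_to_unicode().items()}
    raw = bytes(char_to_byte[ch] for ch in token)
    # errors="replace" keeps partial multi-byte tokens from raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_byte_level_token("âĸĪâĸĪ"))
```

If the assumption holds, the token decodes to two U+2588 full-block characters. Tokens made only of plain ASCII letters, such as "ties", pass through unchanged.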