INDEX
Explanations
references to specific decades, particularly the 1960s, 1970s, 1980s, and 1990s
Negative Logits
venge      -0.65
ngth       -0.61
terday     -0.61
aukee      -0.61
masc       -0.60
Tickets    -0.59
cam        -0.57
semble     -0.56
cham       -0.56
netflix    -0.56
Positive Logits
s          1.19
-'         0.85
sie        0.85
eties      0.81
ixties     0.81
ies        0.80
ties       0.79
enthal     0.78
era        0.78
sburg      0.76
Activations Density 0.029%