INDEX
Explanations
specific years in the 1990s
references to specific years, particularly the 1980s and 1990s
Negative Logits
semble        -0.72
Edge          -0.68
edge          -0.66
bow           -0.66
lightsaber    -0.66
jriwal        -0.66
arent         -0.65
ndra          -0.65
holder        -0.61
hire          -0.61
Positive Logits
ĸļ            0.97
ãĥŁ           0.78
å¹            0.75
-'            0.73
ãĥ©ãĥ³        0.72
ãĥł           0.70
ãĥĥãĥĪ        0.69
rade          0.68
census        0.68
aji           0.67
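Several of the positive-logit tokens look like mojibake, but they are consistent with how byte-level BPE vocabularies (GPT-2 style) display raw bytes as printable characters. The sketch below is one way to turn such strings back into text. It assumes the tokens come from a GPT-2-style byte-to-unicode table, which this page does not state; the helper names `build_char_to_byte` and `decode_token` are hypothetical and not part of any dashboard API.

```python
def build_char_to_byte():
    """Invert a GPT-2-style byte-to-unicode table (assumed, not confirmed by the page)."""
    # Bytes that are displayed as themselves in the byte-level vocabulary.
    printable = set(range(0x21, 0x7F)) | set(range(0xA1, 0xAD)) | set(range(0xAE, 0x100))
    char_to_byte = {}
    shifted = 0
    for b in range(256):
        if b in printable:
            char_to_byte[chr(b)] = b
        else:
            # Non-printable bytes are shown as code points starting at U+0100.
            char_to_byte[chr(256 + shifted)] = b
            shifted += 1
    return char_to_byte


CHAR_TO_BYTE = build_char_to_byte()


def decode_token(token: str) -> str:
    """Map each displayed character back to its byte, then decode as UTF-8.

    Tokens that hold only part of a multi-byte character decode to U+FFFD,
    because the remaining bytes live in a neighbouring token.
    """
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥ©ãĥ³", "ãĥĥãĥĪ", "census"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, the multi-character entries in the list decode to Japanese katakana fragments, while shorter entries such as `å¹` are incomplete UTF-8 sequences and cannot be read on their own.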
Activations Density 0.044%