Explanations
dates in the format of years
references to numerical years, particularly those formatted as '20xx'
Negative Logits
plom     -0.85
afort    -0.75
ablo     -0.73
pora     -0.69
kell     -0.68
hei      -0.68
uana     -0.67
etsk     -0.66
ãĥ£      -0.65
henko    -0.65
Positive Logits
âĸĪâĸĪ       0.97
committee    0.85
th           0.85
oz           0.78
ISH          0.76
skirts       0.76
mph          0.75
SW           0.74
40           0.73
%"           0.73
Activations Density: 0.056%
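
Two entries in the logit lists above, `ãĥ£` and `âĸĪâĸĪ`, look like mojibake but are more likely raw vocabulary strings from a byte-level BPE tokenizer. The sketch below assumes a GPT-2-style byte-to-unicode table, which this page does not state; if the underlying model uses a different tokenizer, the mapping will not apply. The function name `decode_token` is hypothetical and only for illustration.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2-style table mapping each byte (0-255) to a printable character."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Remaining bytes are shifted to code points starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: printable character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Convert a vocabulary string back to the text it encodes (assumes UTF-8 bytes)."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # errors="replace" keeps partial multi-byte tokens from raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥ£", "âĸĪâĸĪ", "committee"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, `ãĥ£` decodes to the katakana character `ャ` and `âĸĪâĸĪ` to two full-block characters `██`, while plain ASCII tokens such as `committee` are unchanged.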