INDEX
Explanations
dates or years
references to years and numerical data
Negative Logits
raltar    -0.83
awed      -0.77
rophe     -0.76
ledged    -0.75
ysical    -0.75
ramid     -0.75
otle      -0.74
rotein    -0.72
behavi    -0.72
ãĤ©       -0.69
Positive Logits
nd        1.51
naire     0.92
80        0.87
186       0.82
ND        0.82
ipop      0.81
502       0.79
160       0.79
50        0.78
181       0.78
Activations Density: 0.057%