INDEX
Explanations
mentions or descriptions of individuals labeled as "seasoned" or experienced in a particular field
words associated with experience and expertise
NEGATIVE LOGITS
ples        -0.80
rador       -0.80
arters      -0.78
ble         -0.77
pps         -0.75
arations    -0.75
pler        -0.73
redit       -0.73
lde         -0.73
orer        -0.73
POSITIVE LOGITS
à©          0.81
senal       0.78
à¨          0.75
à¨          0.73
ãĤ¤ãĥĪ      0.72
¯¯¯¯        0.70
APH         0.70
weary       0.70
20439       0.69
inacc       0.69
Activations Density 0.023%