INDEX
Explanations
references to medical centers or institutions
Negative Logits
intendent    -0.74
��           -0.71
inances      -0.68
imester      -0.68
survived     -0.66
helicop      -0.65
negotiator   -0.65
selage       -0.65
inance       -0.64
GER          -0.63
Positive Logits
arc          0.72
spoiler      0.70
ization      0.69
sburgh       0.65
Prohibition  0.65
recated      0.64
eah          0.64
Limit        0.63
reply        0.63
Scale        0.62
Activations Density 0.016%