INDEX
Explanations
words related to medical conditions and political figures
words ending in the suffix 'ine', which often denote chemical compounds or substances
Negative Logits
awaru     -0.83
dyl       -0.80
ypes      -0.80
ODUCT     -0.78
rity      -0.73
rieved    -0.71
abies     -0.71
opher     -0.69
hips      -0.67
ivated    -0.67
Positive Logits
phrine    1.53
lla       1.28
jad       1.28
lli       1.13
hart      1.11
llo       1.11
apple     1.04
backer    0.96
cone      0.89
ering     0.85
Activations Density: 0.088%