Explanations
mentions of severe damage or limitations
instances of the word "severely" indicating significant negative impacts or damage
Negative Logits
ioch         -0.74
akeru        -0.73
eer          -0.71
yl           -0.71
amide        -0.71
ynthesis     -0.69
tein         -0.69
atu          -0.68
itu          -0.68
iens         -0.67
Positive Logits
retarded         0.83
differentiated   0.82
severely         0.82
deteriorated     0.79
punished         0.78
locked           0.77
disabled         0.76
overweight       0.76
deterior         0.75
underestimate    0.75
Activations Density: 0.016%