INDEX
Explanations
words related to obstacles
references to challenges or impediments
Negative Logits
orp          -0.87
akening      -0.74
entric       -0.73
otide        -0.72
ropolitan    -0.72
zsche        -0.72
daq          -0.71
orf          -0.70
ovy          -0.70
orks         -0.69
Positive Logits
obstacle      1.10
hurdles       1.09
obstacles     1.07
barriers      0.97
hurdle        0.91
imped         0.90
facing        0.85
impede        0.79
insur         0.78
preventing    0.78
Activations Density: 0.040%
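
The density figure can be read as the share of token positions on which the feature fires. The page does not state how the value is computed, so the sketch below is an assumption: it counts a position as active when its activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical, chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions with a nonzero
    (here: strictly positive) activation. The source does not confirm this.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 4 of which activate the feature.
sample = [0.0] * 9996 + [1.3, 0.7, 2.1, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.040%"
```

Under this reading, a density of 0.040% corresponds to roughly 4 active positions per 10,000 tokens, which would be consistent with a sparse feature tied to a narrow topic such as obstacles and impediments.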