Explanations
impeding progress or ability
Negative Logits
Token        Value
0            0.90
was          0.84
Community    0.84
che          0.78
Disorder     0.77
ys           0.76
School       0.75
by           0.73
Media        0.73
Про          0.73
Positive Logits
Token        Value
impede       1.01
répond       0.98
hindering    0.98
hinders      0.97
pince        0.92
jeopardize   0.90
jeopard      0.88
hindrance    0.88
impeded      0.86
impair       0.83
Activations Density: 0.128%