Explanations
verbs related to descent or lowering
Negative Logits
inness    -0.17
bian      -0.16
ж         -0.15
ÑĢел      -0.15
igon      -0.15
itag      -0.15
aurant    -0.15
esters    -0.14
cles      -0.14
up        -0.14
Positive Logits
graded    0.24
grades    0.21
wards     0.20
-down     0.20
/up       0.19
ey        0.19
sville    0.19
grading   0.19
played    0.18
ward      0.18
Activations Density: 0.057%