INDEX
Explanations
with categories, regard, respect, emphasis
Negative Logits
Y       0.70
Я       0.69
/*      0.67
It      0.67
Since   0.63
H       0.62
'{      0.62
etc     0.62
CH      0.59
Ε       0.58
Positive Logits
regard     1.62
regards    1.62
impunity   1.38
standing   1.22
respect    1.18
drawn      1.17
respecto   1.13
whom       1.11
emphasis   1.11
holds      1.10
Activations Density 0.464%