Explanations
articles used to introduce nouns
Negative Logits
stance     -0.15
.ie        -0.14
ama        -0.14
ivery      -0.14
tư         -0.14
verture    -0.14
uster      -0.14
ово        -0.14
feat       -0.13
opic       -0.13
Positive Logits
reason         0.26
chance         0.23
saying         0.23
difference     0.21
temptation     0.20
danger         0.20
possibility    0.20
tendency       0.19
need           0.18
chances        0.18
Activations Density 0.079%