INDEX
Explanations
common words and conjunctions used frequently in texts
Negative Logits
kö            -0.17
.metamodel    -0.16
å£            -0.15
wit           -0.14
embali        -0.14
ég            -0.14
restriction   -0.14
"./           -0.14
Morm          -0.14
aghan         -0.13
Positive Logits
main              0.17
ahu               0.17
amat              0.16
å°½               0.15
experimentation   0.15
TESTING           0.15
Experiment        0.15
sắp               0.15
sacrifice         0.15
tester            0.15
Activations Density 0.003%