Explanations
phrases indicating the highest degree of comparison or superlatives
Negative Logits
strup     -0.19
itler     -0.18
more      -0.18
.less     -0.17
более     -0.16
alone     -0.16
oled      -0.14
trys      -0.14
ivec      -0.14
æĽ´       -0.14
Positive Logits
recent     0.23
likely     0.21
acci       0.21
pressing   0.19
Recent     0.19
arda       0.19
recent     0.19
-common    0.19
afa        0.18
important  0.18
Activations Density 0.057%