Explanations
words that indicate new entities or concepts in various contexts
Negative Logits
Token            Logit
ComVisible       -0.72
Italij           -0.62
pareti           -0.62
saites           -0.61
#+#              -0.60
SourceChecksum   -0.60
chufe            -0.58
ViewFeatures     -0.58
varandra         -0.58
abinieri         -0.57
Positive Logits
Token            Logit
new              1.46
新的             1.28
nueva            1.23
new              1.21
新               1.18
nuevos           1.15
nuevas           1.14
nuevo            1.12
baru             1.12
novas            1.10
Activations Density: 0.196%