INDEX
Explanations
numerical values
specific numeric or promotional references and mentions within documents
Negative Logits
tumble        -0.58
stroll        -0.57
Canaver       -0.57
duplication   -0.57
downfall      -0.57
commute       -0.56
surn          -0.55
poop          -0.51
collisions    -0.51
anonymity     -0.51
Positive Logits
ause          0.79
oren          0.71
icken         0.67
etermined     0.65
yond          0.63
renheit       0.63
ene           0.62
owship        0.62
lass          0.61
sworth        0.60
Activations Density 0.902%