INDEX
Explanations
instances where something is discovered or observed
phrases that indicate the discovery of something unexpected or significant
Negative Logits
derog        -0.83
vote         -0.72
jong         -0.70
ongo         -0.69
escal        -0.69
xus          -0.67
commit       -0.67
derivative   -0.67
otine        -0.67
imo          -0.66
Positive Logits
waking       0.76
lifeless     0.72
LESS         0.72
greeted      0.70
beautiful    0.69
tons         0.67
emptiness    0.67
behold       0.66
reapp        0.66
snowy        0.66
Activations Density: 0.200%