Explanations
words describing experiences or emotions tied to personal events and contemplation
Negative Logits
binary     -0.76
ussian     -0.73
rad        -0.71
ustom      -0.70
endeav     -0.69
ardless    -0.68
abouts     -0.68
mbuds      -0.68
mine       -0.66
ction      -0.65
Positive Logits
³³³         0.87
ONSORED     0.81
³³³³        0.75
Anyway      0.75
UPDATE      0.73
Until       0.73
UPDATE      0.73
Correction  0.73
Wr          0.71
Anyway      0.69
Activations Density 0.394%