INDEX
Explanations
words related to famous individuals
words associated with deception or lack of integrity
Negative Logits
craving    -0.70
Ramadan    -0.70
reliance   -0.68
lapse      -0.68
neoc       -0.68
inished    -0.68
famine     -0.65
behold     -0.65
abase      -0.65
embargo    -0.62
Positive Logits
cheon      0.84
hett       0.82
itzer      0.75
arios      0.72
berus      0.69
lator      0.68
iron       0.68
uli        0.68
udo        0.67
river      0.66
Activations Density: 0.070%