Explanations
expressions of positivity and good sentiment
Negative Logits
Token         Logit
which         -0.46
verke         -0.40
które         -0.39
available     -0.38
いいのか      -0.38
którą         -0.36
といけない    -0.35
actionMode    -0.35
released      -0.34
containing    -0.34
Positive Logits
Token         Logit
idea          0.73
job           0.73
stuff         0.73
abestanden    0.66
reminder      0.63
luck          0.61
choice        0.60
JOB           0.60
Datuak        0.60
idéia         0.60
Activations Density 0.382%