INDEX
Explanations
terms related to motivation and motivational concepts
Negative Logits
hole     -0.17
hone     -0.17
ESIS     -0.16
atura    -0.16
ialog    -0.16
halt     -0.16
ITES     -0.15
eda      -0.15
hos      -0.15
oje      -0.15
Positive Logits
ivation   0.37
ivated    0.35
ley       0.33
ivating   0.31
oring     0.30
own       0.29
ivate     0.29
ives      0.25
tram      0.25
ifs       0.25
Activations Density 0.010%