INDEX
Explanations
terms related to people's level of interest or enthusiasm
Negative Logits
Token        Logit
Fail         -0.68
stacked      -0.67
hemy         -0.63
ework        -0.63
botched      -0.60
ania         -0.58
fabricated   -0.57
Saints       -0.56
nova         -0.56
orthodox     -0.56
Positive Logits
Token        Logit
ately        0.73
enza         0.69
ãĥĦ          0.68
atile        0.68
interested   0.67
iotics       0.66
ATIVE        0.66
rers         0.66
inery        0.65
ENC          0.65
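The `ãĥĦ` entry in the positive-logit list looks like a byte-level BPE token string rather than corrupted text. The sketch below assumes the vocabulary uses the GPT-2 style byte-to-unicode mapping; the page does not name the tokenizer, so this is an assumption. The helper names `bytes_to_unicode` and `decode_token` are illustrative and not taken from the page.

```python
def bytes_to_unicode():
    """Build the GPT-2 style map from each byte value (0-255) to a printable character."""
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    # Remaining bytes are assigned code points starting at 256, in byte order.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return dict(zip(byte_values, map(chr, code_points)))


def decode_token(token):
    """Turn a vocabulary string back into text (assumes GPT-2 style byte-level BPE)."""
    char_to_byte = {ch: b for b, ch in bytes_to_unicode().items()}
    raw = bytes(char_to_byte[ch] for ch in token)
    # A single token may hold only part of a character; replace anything undecodable.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("\u00e3\u0125\u0126"))  # the token shown as ãĥĦ
```

Under that assumption the string stands for the bytes E3 83 84, which is the UTF-8 encoding of the katakana character ツ. If the page came from a different tokenizer, the mapping would not apply.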
Activations Density 0.021%
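An activation density figure like this is usually read as the fraction of sampled tokens on which the feature fires. The page does not state its definition, so the sketch below assumes "fires" means an activation strictly above a threshold of zero. The function name `activation_density` and the sample values are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of positions where the feature's activation exceeds the threshold.

    Assumed definition: the source page does not say how density is computed.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


if __name__ == "__main__":
    # Made-up sample: the feature fires on 1 of 4 token positions.
    sample = [0.0, 0.0, 0.5, 0.0]
    print(f"{activation_density(sample):.3%}")
```

On this reading, a density of 0.021% means the feature is active on roughly 21 of every 100,000 tokens.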