INDEX
Explanations
words or phrases related to likes, such as "liked" or "likes"
the presence of segment markers in the text
Negative Logits
ãĥ¼ãĥĨãĤ£      -0.70
Rim           -0.68
Lit           -0.65
spirited      -0.63
Satanic       -0.61
Labor         -0.61
compensated   -0.60
éĹĺ           -0.58
ħĭ            -0.58
Debor         -0.57
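Entries such as ãĥ¼ãĥĨãĤ£ and éĹĺ in the list above look like byte-level BPE token strings rather than encoding damage. The sketch below assumes, without confirmation from this page, that the vocabulary uses the GPT-2 style byte-to-character mapping; under that assumption it maps each displayed character back to its byte and decodes the result as UTF-8. The helper names `bytes_to_unicode`, `token_bytes`, and `decode_token` are chosen here for illustration.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style table mapping each byte (0-255) to a printable character."""
    # Bytes that are already printable keep their own code point.
    bs = list(range(ord("!"), ord("~") + 1)) + list(range(0xA1, 0xAD)) + list(range(0xAE, 0x100))
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to an unused code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_bytes(token):
    """Return the raw bytes behind a displayed token string."""
    return bytes(UNICODE_TO_BYTE[ch] for ch in token)


def decode_token(token):
    """Decode a displayed token to text; incomplete UTF-8 sequences become U+FFFD."""
    return token_bytes(token).decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥ¼ãĥĨãĤ£", "éĹĺ", "ħĭ", "Labor"]:
        print(repr(tok), token_bytes(tok), repr(decode_token(tok)))
```

Under this assumption the first entry decodes to the katakana string ーティ and éĹĺ to the character 闘, while ħĭ is a pair of UTF-8 continuation bytes (0x85 0x8B) that does not form a complete character on its own.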
Positive Logits
ayers         1.35
ocated        1.30
ateral        1.30
oyal          1.28
aptop         1.27
ifestyle      1.27
ibrarian      1.25
ongevity      1.24
aying         1.22
apsed         1.21
Activations Density 0.034%
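The page does not define how the density figure is computed. A common reading, assumed here, is the fraction of sampled token positions on which the feature's activation is above zero. The following is a minimal sketch of that calculation; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`."""
    values = list(activations)
    if not values:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in values if a > threshold)
    return active / len(values)


if __name__ == "__main__":
    # Made-up sample: 34 active positions out of 100,000 tokens.
    sample = [0.0] * 99_966 + [1.2] * 34
    print(f"{activation_density(sample):.3%}")
```

On this reading, a density of 0.034% corresponds to roughly 34 active positions per 100,000 tokens, which is what the made-up sample reproduces.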