INDEX
Explanations
comparisons and descriptions using the word "like"
phrases that describe experiences or perceptions of "what it is like" to be in various situations
Negative Logits
arta      -0.73
Bot       -0.70
ondo      -0.63
doubt     -0.62
DK        -0.61
Mat       -0.60
ho        -0.60
Plain     -0.59
answ      -0.58
dispute   -0.58
Positive Logits
liest         0.70
unfolding     0.70
antes         0.66
Melania       0.62
âĹ¼           0.62
thro          0.61
abouts        0.60
coloring      0.59
culturally    0.59
encountering  0.58
Activations Density 0.063%