INDEX
Explanations
phrases related to thoughts, expectations, and imaginations
phrases indicating assumptions or common perceptions
Negative Logits
Miko        -0.65
âĹı         -0.64
Pruitt      -0.64
Phelps      -0.60
WWF         -0.60
ritz        -0.59
mediation   -0.59
disarm      -0.57
Katy        -0.57
Central     -0.56
Positive Logits
Downloadha  0.84
ault        0.75
iar         0.72
eeds        0.71
heit        0.69
uma         0.67
um          0.66
charism     0.65
otherwise   0.64
charact     0.63
Activations Density: 0.185%