Explanations
phrases related to interactions and communications between people
Negative Logits
initially                   -0.16
лишком                      -0.15
COVID                       -0.14
202                         -0.14
ðŁĴ                         -0.14
convention                  -0.14
U+200D (zero-width joiner)  -0.14
fully                       -0.13
everything                  -0.13
ðŁĴ                         -0.13
Positive Logits
ちょっと    0.19
chances     0.18
chance      0.18
interested  0.17
interested  0.17
illet       0.17
yonel       0.16
Interested  0.16
interest    0.16
관심        0.16
Activations Density 0.081%