INDEX
Explanations
phrases indicating a large number of people showing interest, involvement, or shared experience in a context
occurrences of the word "have"
Negative Logits
Anyway        -0.65
Apart         -0.62
most          -0.61
vi            -0.59
anny          -0.57
Niet          -0.56
NOR           -0.56
TG            -0.56
advertising   -0.55
Pair          -0.55
Positive Logits
been          1.43
been          1.29
gotten        1.14
undergone     1.12
gone          1.10
fallen        1.09
arisen        1.03
risen         1.01
Been          1.00
come          0.97
Activations Density: 0.195%