Explanations
Instances of the word "have" and its variants, closely associated with possession or existence.
Negative Logits
Token       Logit
Credito     -0.74
pmatrix     -0.73
Trades      -0.69
nitus       -0.69
cedes       -0.67
Mémoires    -0.67
Amis        -0.66
Nils        -0.66
açıklama    -0.65
uitton      -0.65
Positive Logits
Token       Logit
having      0.87
HAVE        0.77
have        0.73
Having      0.73
Have        0.72
having      0.71
HAV         0.71
Having      0.67
haya        0.65
bianche     0.64
Activations Density: 0.101%
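
As a rough illustration of what an activation density figure measures, the sketch below computes the fraction of token positions on which a feature fires. It assumes density means "share of tokens with activation above a threshold"; the function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and are not taken from this page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions where the feature fires.

    Assumption: a feature "fires" when its activation is strictly
    greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical toy data: the feature fires on 1 of 1000 token positions.
toy_activations = [0.0] * 999 + [2.5]
density = activation_density(toy_activations)
print(f"Activations Density: {density:.3%}")
```

Under this reading, a density near 0.1% would mean the feature is sparse, firing on roughly one token in a thousand.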