Explanations
occurrences of the word "have" in various forms
Negative Logits
autorytatywna    -0.64
disambiguazione  -0.59
IntoConstraints  -0.58
                 -0.57
themſelves       -0.56
ujednoznacz      -0.55
utveckling       -0.55
kasarigan        -0.51
лтемелер         -0.50
homonymie        -0.49
Positive Logits
I        0.74
I        0.69
my       0.63
myself   0.62
am       0.58
myself   0.54
guess    0.53
Myself   0.51
أنا      0.51
我没有   0.50
Activations Density 0.055%