INDEX
Explanations
specific words that contain the substring "doub"
Negative Logits
Token         Logit
ĸļ            -0.69
Provided      -0.64
ODUCT         -0.63
anwhile       -0.63
Polo          -0.62
Advisor       -0.61
Occupations   -0.61
Afric         -0.61
Adv           -0.61
NETWORK       -0.61
Positive Logits
Token         Logit
ters          1.49
ting          1.42
ter           1.19
ly            1.14
ted           1.09
tered         1.07
ts            1.01
tering        0.96
etooth        0.95
te            0.94
Activations Density: 0.029%