INDEX
Explanations
variations of the word "care" and its derivatives
Negative Logits
dayName      -0.91
ottest       -0.65
req          -0.61
compuls      -0.59
DCS          -0.58
newsletters  -0.58
article      -0.58
ittered      -0.57
appe         -0.56
interf       -0.56
Positive Logits
paren        0.94
kefeller     0.81
thro         0.80
fare         0.78
butt         0.78
ndra         0.78
eer          0.77
oop          0.74
lings        0.74
zan          0.73
Activations Density: 0.004%