INDEX
Explanations
references to dogs and playful behavior in the context of pets
Negative Logits
endblock        -0.68
geslacht        -0.56
matchCondition  -0.53
izarse          -0.49
ktop            -0.49
Staying         -0.49
blijven         -0.48
[]:             -0.47
griega          -0.46
służ            -0.46
Positive Logits
care            0.87
bounding        0.82
sau             0.81
tr              0.81
Care            0.80
barre           0.79
hurt            0.78
lumber          0.77
sca             0.76
Care            0.76
Activations Density 0.317%