INDEX
Explanations
instances of smiling or expressions of happiness
Negative Logits
Token           Logit
ModelAdmin      -0.61
béco            -0.51
admin           -0.48
feroit          -0.47
getHeight       -0.45
Admin           -0.44
feder           -0.44
TestingModule   -0.44
administrative  -0.44
admin           -0.44
POSITIVE LOGITS
Token           Logit
smile           2.92
smiles          2.48
Smile           2.36
smiled          2.34
smile           2.22
Smile           2.17
smiling         2.09
sonrisa         2.00
Smiles          1.90
Smiling         1.84
Activations Density 0.123%