Explanations
Sacred Heart and Sacré-Cœur
Negative Logits
сала        0.41
forestry    0.39
illeri      0.38
legend      0.38
డ్డి         0.38
verte       0.37
вости       0.36
leaving     0.36
verte       0.35
справ       0.35
Positive Logits
Heart       0.68
Heart       0.62
HEART       0.57
हार्ट        0.55
心脏        0.55
heart       0.54
Hearts      0.54
心          0.52
Sacred      0.50
hearts      0.50
Activations Density 0.005%