INDEX
Explanations
repetitions of the word "same"
Negative Logits
propOrder    -0.70
DiCaprio     -0.68
האם          -0.66
hört         -0.65
setupUi      -0.59
{}/          -0.59
TRIBUN       -0.58
льше         -0.58
pios         -0.57
Jegyzetek    -0.56
Positive Logits
same         2.31
Same         2.29
SAME         2.26
same         2.23
Same         2.15
SAME         2.12
samme        1.69
samma        1.67
mesma        1.55
hetzelfde    1.50
Activations Density: 0.127%