Explanations
describing a state or identity
Negative Logits
'.  0.63
.  0.63
'  0.62
này  0.58
tomto  0.55
;  0.54
."  0.53
นี้  0.52
this  0.51
이는  0.50
Positive Logits
një  0.61
μια  0.60
foundational  0.60
ஒரு  0.59
ചോദ്യ  0.59
ಅವಕಾಶ  0.59
ătur  0.57
ничный  0.56
ذج  0.55
isang  0.55
Activations Density 0.020%