Explanations
properties of subjects or states
Negative Logits
Token    Value
i        0.58
ED       0.57
"        0.57
CO       0.54
M        0.51
IL       0.50
t        0.50
cal      0.49
is       0.49
GB       0.49
Positive Logits
Token        Value
olni         0.60
𒅖            0.53
transaksi    0.52
منفی         0.51
ကု           0.51
संगिक        0.49
suku         0.48
*((*         0.48
vitth        0.48
APPE         0.48
Activations Density 0.001%