INDEX
Explanations
referring to position or state
Negative Logits
単    0.50
只    0.49
તૈય    0.47
ცა    0.46
ஏற்க    0.46
첫    0.45
पटॉप    0.45
unbear    0.44
ខ្សែ    0.44
접    0.43
Positive Logits
Belgium    0.45
作为    0.44
Established    0.42
PRESENT    0.41
Paragraph    0.41
热情    0.40
0.40
Defendant    0.39
North    0.39
نامه    0.39
Activations Density 0.001%