Explanations
belongs in, currently running
Negative Logits
Used        0.89
used        0.85
Used        0.85
used        0.79
digunakan   0.75
USED        0.74
!==         0.73
mistaken    0.71
用来        0.71
用於        0.67
Positive Logits
thuộc       1.05
belongs     0.90
屬於        0.85
belong      0.81
belongs     0.81
属于        0.78
categoria   0.77
belong      0.77
Under       0.72
erled       0.71
Activations Density 0.807%