Explanations
categories and involved items
Negative Logits
Token      Value
unfore     0.38
óny        0.37
eman       0.36
")):       0.36
曠         0.36
réelle     0.36
isalpha    0.36
﹙         0.36
かれた     0.35
());       0.35
Positive Logits
Token        Value
involved     0.80
Involved     0.78
relevant     0.70
ที่จะ          0.70
implicated   0.68
involved     0.68
applicable   0.61
Relevant     0.60
relevante    0.59
suitable     0.59
Activations Density: 0.026%