Explanations
No Explanations Found
Negative Logits
overtly         0.63
squarely        0.59
deemed          0.54
immédiatement   0.54
معظم            0.53
obviamente      0.53
asmuch          0.52
materially      0.50
Strongly        0.50
Specified       0.50
Positive Logits
importance      2.06
significance    1.90
importance      1.75
Importance      1.67
importancia     1.63
Importance      1.55
importância     1.54
importanza      1.52
role            1.52
Significance    1.51
Activations Density 0.929%