INDEX
Explanations
elements related to user interactions and navigation options on a webpage
Negative Logits
Token      Logit
BT         -0.32
btw        -0.32
BT         -0.26
incident   -0.25
Incident   -0.20
FY         -0.18
bt         -0.17
_BT        -0.16
incident   -0.15
weis       -0.15
Positive Logits
Token      Logit
pk         0.16
PK         0.15
ucz        0.14
рив        0.14
Zum        0.14
inus       0.14
rak        0.14
ше         0.14
esium      0.13
riz        0.13
Activations Density: 0.074%