Explanations
provocation and provocative
Negative Logits
0            1.05
ла           0.85
esian        0.79
Ayurvedic    0.78
па           0.78
ことを       0.77
CA           0.77
phosphat     0.75
MA           0.75
MIN          0.75

Positive Logits
provocative  1.05
provoke      1.04
provocation  0.93
of           0.93
provoking    0.91
provoc       0.89
provocar     0.88
其他         0.87
provoca      0.84
provoked     0.83

Activations Density 0.005%