Explanations
intervene, stepped in, takes over
Negative Logits
culminated    0.45
vergessen     0.39
menikmati     0.39
formative     0.39
culmination   0.39
burg          0.38
دیتا          0.38
ܗ             0.38
négl          0.37
Achieve       0.37
Positive Logits
intervene     1.86
intervened    1.80
interven      1.77
intervenir    1.64
intervention  1.63
介入          1.61
stepped       1.55
swoop         1.48
intervention  1.48
intervening   1.47
Activations Density: 0.042%