INDEX
Explanations
references to mistakes or errors in processes
Negative Logits
ogne      -0.16
obs       -0.15
gili      -0.14
unreal    -0.14
itere     -0.14
nh        -0.13
Shack     -0.13
atom      -0.13
idge      -0.13
altern    -0.13
Positive Logits
accident        0.62
Accident        0.50
accidents       0.46
accidental      0.45
случай          0.41
accidentally    0.38
acc             0.34
mistake         0.34
inadvert        0.32
Acc             0.29
Activations Density 0.244%