Explanations
unrelated or adjacent items or incidents
references to items or events that are not connected or relevant to the main topic
Negative Logits
ilt         -0.67
®           -0.62
hol         -0.62
Press       -0.60
patience    -0.59
PRESS       -0.59
iership     -0.58
Reef        -0.58
estro       -0.57
iest        -0.57
Positive Logits
unrelated       3.61
related         1.54
irrelevant      1.34
incompatible    1.29
innocuous       1.25
identical       1.23
unaffected      1.23
unspecified     1.21
unexplained     1.17
incidental      1.15
Activations Density 0.012%