INDEX
Explanations
references to investigations
Negative Logits
dos        -0.71
coded      -0.70
absolute   -0.66
mindless   -0.64
shared     -0.64
birth      -0.64
piring     -0.63
generic    -0.62
perm       -0.61
noxious    -0.61
Positive Logits
investigation    1.26
investigations   1.08
inquiry          1.03
Investigation    1.01
Inquiry          0.97
probe            0.97
investigating    0.95
investigates     0.94
probes           0.93
probing          0.92
Activations Density 0.024%