Explanations
phrases indicating takeovers or control
occurrences of the phrase "take over."
Negative Logits
Token                             Logit
ero                               -0.66
WARE                              -0.66
Detect                            -0.63
Forward                           -0.63
Detect                            -0.63
IAS                               -0.62
area                              -0.62
////////////////////////////////  -0.60
osity                             -0.60
Mini                              -0.60
Positive Logits
Token   Logit
drive   0.93
tones   0.92
lord    0.85
haul    0.77
irgin   0.76
reins   0.75
stocks  0.75
lain    0.74
ran     0.72
react   0.71
Activations Density 0.018%