INDEX
Explanations
suspicious activities or objects
instances of the word "suspicious" and its variations
Negative Logits
apsed      -0.78
skill      -0.75
hung       -0.75
mel        -0.73
sung       -0.73
VEL        -0.73
ffen       -0.72
produced   -0.71
arf        -0.71
borg       -0.70
Positive Logits
ly          1.15
suspicious  0.99
suspic      0.96
LY          0.85
ively       0.83
uously      0.83
alarms      0.76
undermin    0.75
lys         0.74
detection   0.74
Activations Density: 0.012%