INDEX
Explanations
phrases or words related to flags being raised or issues being highlighted
instances of the word "flagged" and its related forms, indicating issues or alerts
Negative Logits
Strongh  -1.01
elf      -0.78
avers    -0.70
imates   -0.69
Telesc   -0.67
ists     -0.66
arij     -0.66
asion    -0.65
rall     -0.63
perty    -0.62
Positive Logits
flagged  0.99
ging     0.88
lights   0.76
ged      0.73
Benz     0.70
leased   0.67
outing   0.66
uned     0.64
idon     0.64
undown   0.64
Activations Density 0.016%