Explanations
words related to legal notices or allegations
the term "allegation" and its variations
Negative Logits
Token             Logit
bom               -0.82
OPLE              -0.74
Wilde             -0.72
animate           -0.69
ModLoader         -0.67
ggle              -0.66
³³³³³³³³³³³³³³³³  -0.64
flix              -0.64
nown              -0.63
catch             -0.62
Positive Logits
Token             Logit
heny              1.36
edly              1.23
iance             1.16
ations            0.97
Alleg             0.94
iant              0.94
ict               0.90
iances            0.89
arial             0.84
arie              0.83
Activations Density: 0.018%