Explanations
words related to legal language or implications
occurrences of the word "guilty"
Negative Logits
Token       Logit
cling       -0.77
croft       -0.75
Spectrum    -0.73
ŃĶ          -0.73
spect       -0.67
riad        -0.66
hower       -0.65
HAEL        -0.65
cycle       -0.62
rums        -0.61
Positive Logits
Token         Logit
idelines      1.12
pta           1.09
errilla       1.07
vernment      1.07
arding        1.07
ilty          1.05
arant         1.03
cci           1.01
inea          0.98
bernatorial   0.95
Activations Density: 0.015%