Explanations
phrases related to high stakes or potential consequences
phrases indicating the importance or significance of various matters at stake
Negative Logits
Token      Logit
ript       -0.75
rounder    -0.65
bor        -0.65
orously    -0.63
Pratt      -0.63
chlor      -0.63
Bore       -0.63
omin       -0.62
ithe       -0.62
pat        -0.60
Positive Logits
Token      Logit
FontSize   0.75
xus        0.70
TextColor  0.69
ulkan      0.68
dden       0.67
mma        0.66
ħĭ         0.65
uble       0.65
disposal   0.63
brink      0.62
Activations Density: 0.031%
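
An activation density figure of this kind is commonly read as the share of sampled tokens on which the feature fires. The sketch below assumes that definition; the source does not say how the 0.031% value was computed. The function `activation_density`, its `threshold` parameter, and the toy data are hypothetical and for illustration only.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of tokens with a nonzero
    (above-threshold) activation. This is not confirmed by the source.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical toy data: 10,000 tokens, 3 of which activate the feature.
toy = [0.0] * 9997 + [1.2, 0.4, 2.5]
density = activation_density(toy)
print(f"{density:.3f}%")
```

Under this reading, a density near 0.03% would mean the feature fires on roughly 3 tokens in every 10,000, which fits a feature tied to a narrow family of phrases.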