INDEX
Explanations
mentions of political or societal issues
instances of a specific symbol or character used in the document
Negative Logits
sled        -0.75
cottage     -0.69
Manhattan   -0.66
coat        -0.64
unmarked    -0.63
Somerset    -0.61
cloak       -0.61
coats       -0.61
regist      -0.60
isode       -0.59
Positive Logits
¬    1.20
º    0.92
½    0.92
¡    0.92
£    0.92
¥    0.91
ľ    0.91
¾    0.90
Ĵ    0.88
¼    0.88
Activations Density 0.329%
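
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero (a common convention, not confirmed by this page); the function name, the `threshold` parameter, and the sample data are hypothetical:

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature fires at all. The dashboard itself does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 3 of 1000 token positions.
sample = [0.0] * 997 + [0.4, 1.2, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")  # formatted as a percentage, like the figure above
```

Under this reading, a density of 0.329% would mean the feature is active on roughly 3 of every 1,000 sampled tokens.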