INDEX
Explanations
mentions of advocacy or support for certain ideas or beliefs
the term "espouse" in various contexts
Negative Logits
Token     Logit
folk      -0.89
Reviewer  -0.83
Runner    -0.75
rooms     -0.75
ĸļ        -0.71
tons      -0.71
Halls     -0.69
banks     -0.69
bread     -0.68
gling     -0.68
Positive Logits
Token    Logit
resso    1.10
iscopal  1.07
anches   1.02
onse     0.95
irit     0.92
uti      0.89
aint     0.86
acers    0.84
acial    0.84
utions   0.84
Activations Density: 0.016%
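
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of token positions on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    The source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 4 of 25,000 token positions.
sample = [0.0] * 24_996 + [1.3, 0.7, 2.1, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.016%
```

Under this reading, a density of 0.016% would mean the feature fires on roughly 16 of every 100,000 tokens, which fits a narrow feature tied to a single word such as "espouse".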