INDEX
Explanations
prepositions followed by geopolitical location mentions
Negative Logits
neys         -0.62
reacts       -0.62
AMA          -0.61
APD          -0.60
Flavoring    -0.59
batted       -0.58
succeed      -0.56
Doodle       -0.56
greets       -0.56
interacts    -0.55
Positive Logits
lined        0.94
escap        0.91
versible     0.87
uble         0.85
avering      0.84
bred         0.83
ked          0.82
stated       0.81
putable      0.78
lain         0.78
Activations Density: 1.231%