INDEX
Explanations
political or governmental terms and phrases
important numerical references or comparisons in a text
Negative Logits
apons      -0.72
使         -0.64
dinand     -0.63
anooga     -0.63
>]         -0.63
orum       -0.59
武         -0.58
zhou       -0.58
anche      -0.55
anners     -0.55
Positive Logits
nutshell     0.71
coincidence  0.69
itch         0.65
pecul        0.59
kinda        0.57
hypocrisy    0.55
funny        0.52
squirrel     0.52
crochet      0.52
coward       0.52
Activations Density 0.869%