INDEX
Explanations
phrases related to surprise or revelation
terms related to surveillance and observation
Negative Logits
idian       -0.68
behind      -0.68
antine      -0.63
apo         -0.63
Sachs       -0.59
simultane   -0.58
ombs        -0.58
kson        -0.55
restrial    -0.55
stadt       -0.54
Positive Logits
illance     0.74
cliffe      0.68
pecting     0.63
Suite       0.61
Spread      0.60
fit         0.58
Condition   0.58
enegger     0.57
uber        0.57
onding      0.55
Activations Density: 0.123%