INDEX
Explanations
informative words and phrases related to official communication or documentation
instances of communication or messages received
Negative Logits
Token     Logit
ãĥĦ       -0.68
usra      -0.66
ogi       -0.65
zai       -0.64
ERSON     -0.60
>>\       -0.60
Politics  -0.60
Bus       -0.59
guard     -0.59
ommod     -0.59
Positive Logits
Token      Logit
these      1.38
ones       1.36
them       1.32
These      1.31
these      1.27
These      1.18
THESE      1.12
THEM       1.05
those      0.94
originals  0.90
Activations Density: 1.404%