INDEX
Explanations
actions or instructions related to performing tasks or operations
action verbs and terms related to engagement or interaction
Negative Logits
Seym        -0.78
Beir        -0.68
submar      -0.68
Frie        -0.67
Vaugh       -0.64
Nare        -0.64
Maver       -0.63
Niet        -0.63
Palestin    -0.62
Azerb       -0.60
Positive Logits
guiActiveUnfocused    0.78
ments                 0.78
ables                 0.77
able                  0.73
yon                   0.72
_                     0.70
backs                 0.70
]                     0.70
html                  0.69
ername                0.68
Activations Density 0.124%