INDEX
Explanations
action verbs in the past tense
verbs or phrases indicating ongoing actions and states
Negative Logits
Token          Logit
Kush           -0.65
disclosures    -0.64
Lies           -0.61
Gutenberg      -0.60
languages      -0.58
Shal           -0.58
Tinder         -0.57
Byr            -0.57
Lobby          -0.57
Hey            -0.57
Positive Logits
Token          Logit
tnc            0.80
nih            0.78
MpServer       0.76
');            0.75
ubes           0.72
>]             0.72
DCS            0.71
saf            0.71
ãĤ©            0.70
ibo            0.68
Activations Density: 0.417%
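
As a rough illustration of what an activation density figure like this could mean, the sketch below assumes it is the fraction of token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the toy data are all assumptions made for illustration; they are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumed definition: density = (# positions where the feature fires) / (# positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical toy data, not real measurements: the feature fires on
# 5 of 1200 token positions and is exactly zero everywhere else.
toy = [0.0] * 1195 + [0.3, 1.2, 0.7, 2.1, 0.05]
density = activation_density(toy)
print(f"{density:.3%}")  # 5 / 1200, printed as a percentage
```

Under this reading, a density well below 1% would indicate a sparse feature that fires on only a small share of tokens.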