INDEX
Explanations
verbs related to providing information or explanations
instances where previous points or mentions are referenced
Negative Logits
wcs         -0.76
ãĤ¼ãĤ¦ãĤ¹   -0.73
OPE         -0.70
orest       -0.70
enez        -0.69
replica     -0.68
erate       -0.68
ctors       -0.67
ealous      -0.65
orah        -0.65
Positive Logits
[|          0.75
Tale        0.74
Hacker      0.71
TOD         0.71
newsp       0.70
Hier        0.68
commenter   0.67
Mish        0.67
spoiler     0.65
âĺ          0.65
Activations Density 0.182%
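
Two of the token strings listed above (ãĤ¼ãĤ¦ãĤ¹ and âĺ) look like the printable byte-level form that GPT-2 style BPE tokenizers use for non-ASCII bytes. The source does not say which tokenizer produced them, so the following is only a minimal sketch under that assumption. The helper names (`bytes_to_unicode`, `token_bytes`, `decode_token`) are illustrative and not part of any dashboard API.

```python
# Sketch: map GPT-2 style byte-level token strings back to raw bytes / text.
# ASSUMPTION: the dashboard shows tokens in the GPT-2 byte-to-unicode display
# form; this is not confirmed by the source.

def bytes_to_unicode():
    """Build the byte -> printable-character table used by GPT-2 style BPE."""
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            # Non-printable bytes are shifted to code points starting at 256.
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


# Inverse table: displayed character -> original byte value.
_CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_bytes(token: str) -> bytes:
    """Return the raw bytes behind a displayed token string."""
    return bytes(_CHAR_TO_BYTE[ch] for ch in token)


def decode_token(token: str) -> str:
    """Decode a displayed token to text; incomplete UTF-8 becomes U+FFFD."""
    return token_bytes(token).decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("ãĤ¼ãĤ¦ãĤ¹"))  # katakana text
    print(token_bytes("âĺ"))          # a partial UTF-8 sequence
    print(decode_token("wcs"))         # plain ASCII is unchanged
```

Under this assumption, the second negative-logit token decodes to the katakana string ゼウス, while âĺ corresponds to the bytes E2 98, which is only the first two bytes of a three-byte UTF-8 character and so does not decode to a single symbol on its own.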