INDEX
Explanations
adjectives that convey quality or evaluation
Negative Logits
ama     -0.17
ivery   -0.17
feat    -0.15
anzi    -0.14
ezi     -0.14
828     -0.14
McKay   -0.14
ico     -0.14
aku     -0.13
ourke   -0.13
Positive Logits
reason       0.23
chance       0.21
лан          0.17
danger       0.17
limit        0.17
possibility  0.17
disconnect   0.16
GetName      0.15
Chance       0.15
likelihood   0.15
Activations Density 0.070%