INDEX
Explanations
phrases that emphasize the significance or importance of concepts and elements within the text
Negative Logits
Token     Logit
onica     -0.17
uffers    -0.15
лишком    -0.15
овари     -0.15
ged       -0.15
991       -0.14
elop      -0.14
dings     -0.14
ulumi     -0.14
jf        -0.14
Positive Logits
Token     Logit
antly     0.24
ance      0.21
eous      0.18
ölçüde    0.17
/use      0.17
ially     0.16
aspect    0.16
iating    0.16
ment      0.16
unate     0.16
Activations Density: 0.047%